Abstract A major feature of the emerging geo-social
networks is the ability to notify a user when any of his 
friends (also called buddies) happens to be geographi- 
cally in proximity. This proximity service is usually of- 
fered by the network itself or by a third party service 
provider (SP) using location data acquired from the 
users. This paper provides a rigorous theoretical and 
experimental analysis of the existing solutions for the 
location privacy problem in proximity services. This is 
a serious problem for users who do not trust the SP 
to handle their location data, and would only like to 
release their location information in a generalized form 
to participating buddies. The paper presents two new 
protocols providing complete privacy with respect to 
the SP, and controllable privacy with respect to the 
buddies. The analytical and experimental analysis of 
the protocols takes into account privacy, service precision,
and computation and communication costs, showing
the superiority of the new protocols compared to
those that have appeared in the literature to date. The proposed
protocols have also been tested in a full system imple- 
mentation of the proximity service. 
1 Introduction

A geo-social network is an extension of a social network 
in which the geographical positions of participants and 
of relevant resources are used to enable new information 
services. These networks are mostly motivated by the 
increased availability of GPS-enabled mobile devices 
that support both Location-Based Services (LBSs), and 
easy access to the current social networks. 

As in most social networks, each user has a contact 
list of friends, also called buddies. A basic service in
geo-social networks is the proximity service that alerts the
user when any of her buddies is in the vicinity, possibly 
enacting other activities like visualizing the buddy's po- 
sition on a map, or activating a communication session 
with the buddy. Such proximity services, often called 
friend finders, are already available as part of geo-social
networks (e.g., Brightkite^1), as part of a suite of map
and navigation services (e.g., Google Latitude^2), or as
an independent service that can be integrated with social
networks (e.g., Loopt^3).

From a data management point of view, a proxim- 
ity service involves the computation of a range query 
over a set of moving entities issued by a moving user, 
where the range is a distance threshold value decided 
by the user. All existing services are based on a central- 
ized architecture in which location updates, issued from 
mobile devices, are acquired by the SP, and proximity 
is computed based on the acquired locations. 



^1 http://brightkite.com
^2 http://www.google.com/latitude
^3 http://www.loopt.com



Privacy threats in LBS 



The location privacy problem in proximity services 



While proximity services are very attractive for many 
social network users, the repeated release of informa- 
tion about where the user is at a given time raises se- 
vere privacy concerns. This is an issue that has been 
deeply investigated in recent years for general LBSs,
although no general consensus has been reached about
how the privacy problem should be defined, measured 
and, consequently, alleviated. For this reason we briefly 
illustrate the general problem before describing our ap- 
proach. 

The lack of agreement observed in the literature is 
mainly due to the dual role that location information 
plays in LBS privacy. On one side, location is considered 
the private data that a user does not want to disclose, 
because it may be itself sensitive information, or be- 
cause it may lead to disclosure of sensitive information. 
For example, by knowing that a user is in a synagogue 
during an important religious ceremony, an adversary 
may infer, with a certain probability, the user's religious 
belief, which may be considered a privacy violation by 
this user. On the other side, location information may 
act as a quasi-identifier, i.e., when this information is
joined with external data it may compromise the user's 
anonymity, and hence allow an adversary to associate 
the user's identity with the sensitive information related 
to the service. For example, suppose a user subscribes 
to a location-based dating service using a pseudonym; 
even if the locations released to the service are not 
considered sensitive by her, her identity can be recov- 
ered by first deriving, from her trace of movements, her 
home and workplace addresses and then joining these 
addresses with public data, like a telephone directory. 
In this way, the adversary can deduce the identity of 
the dating service user, a privacy violation. 

Since the specific position of a user at a given time 
can play either of the roles illustrated above, two different
privacy notions have appeared in the LBS literature: a)
location privacy which assumes that untrusted parties 
may know the user's identity but not the user's loca- 
tion, or at least not the user's precise location, which 
is considered sensitive and has to be protected [TSII^ 
I181I11J , and b) identity privacy in which the anonymity 
of the user must be preserved by preventing (precise
or imprecise) location information from being used as
a quasi-identifier [10,12,17,7]. Techniques adopted for
the second notion (e.g., spatial cloaking to include at 
least k users in the released location) do not neces- 
sarily provide privacy guarantees for the first notion, 
and vice versa. In Section 2 we briefly review the main
techniques applicable to proximity services, including 
approaches trying to address both privacy notions. 



In this paper we consider geo-social network proximity
services in which a user usually knows the identity of 
her buddies, or may easily discover it. In this context, 
identity privacy is not an issue, since the anonymity of 
the buddies is not assumed. For this reason, the problem
we address is location privacy preservation, i.e.,
the first notion according to the above discussion. We
assume that both SP and buddies are considered as po- 
tential adversaries, that is, a) the users do not trust the 
service provider that will handle their (sensitive) loca- 
tion data, and b) the users would like to control the 
precision of the location data released to their buddies. 

The above assumption of limited trust is formalized 
in terms of privacy preferences. Regarding a), we make 
the strong requirement that SP should not acquire any 
location information about the users; regarding b), each 
user can specify the finest precision of the location infor- 
mation that can be disclosed to her buddies, where the 
precision is in terms of a spatial region of uncertainty 
containing the current location. For example, user Alice
allows her buddy Bob to be notified when she is
in proximity, but she wants a) to completely hide her
location from the SP, and b) to ensure that, whatever
proximity threshold (i.e., the radius of the proximity query)
Bob is using, he cannot determine where exactly Alice
is located within a region of her own choosing (e.g., the
whole university campus).

Existing proximity services do not offer any protec- 
tion regarding point a) above other than legal privacy 
policy statements, and they offer a very limited control 
regarding point b); for example, some solutions allow 
the user to limit the location released to the buddies 
to the precision level of city. Preliminary studies of this 
problem have appeared in the academic literature
[30,18,26,16,25], and are analysed in detail in Section 2,
but the solutions provided in these studies have limita- 
tions either in terms of safety, system costs, or in terms 
of flexibility in the specification of user preferences and 
adversary model. 

Contribution 

The main contributions of this paper are the following. 

i) This is the first comprehensive rigorous study of lo- 
cation privacy in proximity services, explicitly taking 
into account privacy control with respect to buddies. 

ii) Two new location privacy preserving protocols are 
designed, formally analyzed, and empirically tested, show- 
ing their superiority with respect to existing solutions. 

We formally model the privacy preferences that each
user can specify as well as the properties that a set of
messages, exchanged between a user and the SP, should 
have in order to satisfy the privacy preferences. To the 
best of our knowledge, the formal model proposed in 
this paper is the first one to consider adversaries hav- 
ing a-priori probabilistic knowledge of users' location. 
For example, it is possible to model the presence of com- 
mon knowledge, like the fact that a user is more likely 
to be located in the country where she lives rather than 
in a foreign one. 

The formal model is used to prove that the proto- 
cols proposed in this paper guarantee location privacy 
protection. Intuitively, we prove that, for each possible 
a-priori probabilistic knowledge of users' location, each 
set of messages exchanged between a user and the SP 
satisfies the user's privacy preferences. 

Our theoretical and experimental study shows that, 
in addition to privacy protection, the proposed protocols
offer two other main advantages: they have sustainable
communication and computation costs and they
have a low impact on the quality of service. In order to
tackle the problem of system costs, we adopt a centralized
architecture. This solution not only supports
current business models but, by reducing the communication
and computation costs on the clients, is also
more appropriate for a proximity service than
a decentralized architecture like the ones proposed in
previous works [30,18]. As for the quality of
service, the performance of the proposed protocols can
be controlled by the user's preferences. Indeed, unlike
previous solutions [30,26], we allow each user to
specify one parameter for the proximity threshold and
a different parameter for the privacy preference. In addition
to being more flexible than existing approaches, our
techniques provide a much higher quality of service.

The two protocols shown in this paper present the 
same service precision but differ in the trade-off be- 
tween privacy guarantees and system costs. Indeed, the 
first protocol, called C-Hide&Seek^4, is shown to provide
complete protection with respect to the SP, and to sat- 
isfy the privacy requirements of each user with respect 
to her buddies. Its efficiency is comparable with the 
simplistic solution adopted in current services for prox- 
imity computation that provides no privacy protection. 
The second protocol, called C-Hide&Hash, offers the 
same guarantees, but provides an even higher level of 
privacy with respect to the buddies at the price of higher
communication and computation costs.

The rest of the paper is organized as follows. In Section 2
we discuss related work. In Section 3 we describe
more formally the problem we are addressing in terms of
privacy concerns, privacy requirements, and adversary
models. In Section 4 we illustrate the two proposed protocols,
and in Section 5 we study their formal properties,
including the satisfaction of privacy requirements,
the computational and communication costs, and the
service precision. In Section 6 we describe the system
implementation, and in Section 7 we report experimental
results. Section 8 concludes the paper with a discussion
of possible extensions.

^4 C stands for centralized.



2 Related Work 

As mentioned in Section 1, there are three main approaches
for privacy preservation in LBS based on whether
they deal with a) identity privacy, b) location privacy,
or c) a combination of these two. In Section 2.1 we
first discuss the applicability to our reference scenario
of the approaches that deal with (b). We discuss the
approaches that deal with (c) in Section 2.2. Then, in
Section 2.3, we focus on the existing contributions that
are specifically related to proximity services. An exten- 
sive survey of privacy preserving techniques in LBS can 
be found in P]. 

As for approaches that deal with (a),
i.e., identity privacy only, the main technique is based
on the application of the k-anonymity principle [23]
to LBS, and was proposed by Gruteser et al. [10].
k-anonymity is achieved by ensuring that the user's location
sent to the SP as part of the request is a region sufficiently
large to include, in addition to the issuer,
k - 1 other potential issuers. Several variants of this idea and
different algorithms for spatial cloaking have been proposed.
For example, Gedik et al. [7] illustrate a cloaking
algorithm allowing each user to choose a personalized
value of k, while others propose algorithms proved to
be safe even when the adversary knows the defense
function [12,17]. As mentioned in Section 1, these techniques
are not applicable in the reference scenario considered
in this paper, since we consider services in which
users may not be anonymous, independently of location
information. In Section 8 we briefly discuss how the
techniques we propose in this paper can be extended 
to provide identity privacy for those proximity services 
that require anonymity. 
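
To make the cloaking idea concrete, the following Python sketch computes a cloaked region by repeatedly descending into the quadrant that contains the issuer, as long as that quadrant still holds at least k users. It is only an illustration under stated assumptions: the function `cloak`, its parameters (including the stopping size `min_edge`) and the sample coordinates are hypothetical, and the code is not the algorithm of any of the cited works.

```python
def cloak(issuer, users, k, area, min_edge=1.0):
    """Return a square (x0, y0, x1, y1) that contains the issuer.

    The square is shrunk quadrant by quadrant while the issuer's quadrant
    still contains at least k users (the issuer is expected to be in `users`).
    If the whole area holds fewer than k users, the area itself is returned.
    """
    x0, y0, x1, y1 = area
    while x1 - x0 > min_edge:
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        # Quadrant of the current square that contains the issuer.
        qx0, qx1 = (x0, mx) if issuer[0] < mx else (mx, x1)
        qy0, qy1 = (y0, my) if issuer[1] < my else (my, y1)
        inside = sum(1 for (ux, uy) in users
                     if qx0 <= ux < qx1 and qy0 <= uy < qy1)
        if inside < k:
            break  # descending further would leave fewer than k users
        x0, y0, x1, y1 = qx0, qy0, qx1, qy1
    return (x0, y0, x1, y1)


# Hypothetical user positions in a 16 x 16 area; the issuer is at (1, 1).
users = [(1.0, 1.0), (2.0, 3.0), (6.0, 6.0), (12.0, 12.0), (13.0, 3.0)]
area = (0.0, 0.0, 16.0, 16.0)
region_k3 = cloak((1.0, 1.0), users, 3, area)
region_k2 = cloak((1.0, 1.0), users, 2, area)
```

A larger k yields a larger released region, which is the usual trade-off between anonymity and the precision of the location sent to the SP.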



2.1 Location privacy protection 

The intuition behind location privacy (i.e., the first privacy
notion given in Section 1) is that users perceive
their location as private information. However, they
may tolerate that some location information is disclosed
if it is sufficiently unlikely that the adversary discovers
their precise location. To achieve this result, techniques
based on different ideas have been proposed.

One idea is to send requests from fake locations together
with the request from the user's real location
(e.g., [15]). The main problem with the techniques implementing
this idea is that a large number of fake requests
is necessary in order to guarantee privacy protection,
while the system costs grow linearly in the number
of fake requests.

Another solution consists in sending a request (e.g.,
a k-NN query) from a fake location and incrementally
retrieving results (e.g., NN resources) from the SP until
the client can reconstruct the result of the query centered
at the user's real location [28]. Privacy is guaranteed
because the SP can only discover that the user
is located within a region, without learning the exact
location. The distance between the user's real location
and the fake location used in the request determines
a trade-off between privacy and performance. Indeed,
if the distance is large, the size of the region discovered
by the SP is also large, but this results in high system
costs. These techniques have been applied mostly
to LBSs performing k-NN spatial queries, and do not
apply to proximity detection.

A third family of techniques to enforce location pri- 
vacy is based on the idea of enlarging the user's precise 
location before it is sent to the SP to a generalized re- 
gion in order to decrease its sensitivity (among others, 
[18,25,8]). Some of these techniques are specifically designed
for proximity services and we discuss them in
detail in Section 2.3. The main technical problem is
how to process spatial queries in which the parame- 
ters are generalized regions instead of exact locations. 
On the other hand, the advantage is that the general- 
ized region can be specified as a user preference before 
any information is sent by the client. Indeed, this is 
the solution we adopt in this paper to protect a user's 
privacy with respect to her buddies. We actually prove 
that when a user specifies a generalized region, her bud- 
dies do not acquire any location information about that 
user, except the fact that she is inside the generalized 
region. 



2.2 Identity and location privacy protection combined 

Some of the solutions proposed in the literature aim at 
providing both location privacy and identity privacy. 
For example, in Casper [19], users' locations are generalized
to a region that contains at least k users and that
has an area not smaller than a user-specified threshold. 
The problem with this solution is that it is insecure in 
case the adversary knows the generalization technique. 



Another similar technique based on that of [28] is reported
in [27] to tackle both privacy notions with one
algorithm. 

Other solutions providing both location and identity
privacy are inspired by private information retrieval
(PIR) methods. The idea is to encrypt the information 
exchanged with the SP, and to process the correspond- 
ing query in an encrypted form, so that no location 
information is revealed to the SP. The techniques proposed
in [9,20] are specifically designed for NN queries,
while [14] considers range queries over static resources,
which is still not the proper setting for proximity detection.
Khoshgozaran et al. [13] propose a system to
maintain an encrypted index on the server side and efficiently
update it, which makes it suitable for maintaining
a database of moving buddies. The system supports
encrypted range and k-NN spatial queries, hence
it could be used to offer proximity based services. How- 
ever, the system requires users to be organized in groups, 
with each group sharing a symmetric secret key, and all 
the users in a group must trust each other. Furthermore, 
the proposed techniques for retrieving the query results 
seem to be vulnerable to cardinality attacks [20], if the
SP has a-priori knowledge about the distribution of the 
users. 

These encryption-based techniques guarantee loca- 
tion and identity privacy because no location informa- 
tion is disclosed. Consequently, if it is assumed that 
users are anonymous, then identity privacy is also guar- 
anteed, since no location information can be used to 
re-identify the user. Considering location privacy, these 
techniques provide the same protection as a solution 
based on location-enlargement in which the user's loca- 
tion is generalized to the entire world, i.e., they provide 
the maximum privacy protection. For this reason, in 
this paper we adopt this solution to guarantee privacy 
with respect to the SP. 



2.3 Location privacy protection in proximity services 

Computing proximity involves the continuous evalua- 
tion of spatial range queries over a set of moving en- 
tities, with a dynamic radius range [51IM]. The litera- 
ture on this problem is both from the database, and 
the mobile computing community; recent contributions 
are briefly surveyed in Ij, where an efficient algorithm 
for proximity detection named Strips is presented. The 
goal of this and similar approaches (e.g., [29]) is the 
efficiency in terms of computation and communication 
complexity, while privacy issues are mostly ignored. 

Ruppel et al. [21] propose a technique for privacy
preserving proximity computation based on the application
of a distance preserving transformation on the
location of the users. The problem with this solution is
that the SP is able to obtain the exact distances be- 
tween users, and this can lead to a privacy violation. 
For example, by using this knowledge, it is possible to 
construct a weighted graph of all the users, assigning 
to each edge connecting two users their exact distance. 
It is easily seen that a "relative" distribution of the 
user locations can be extracted from this graph. If the 
SP has a-priori knowledge about the distribution of the 
users (as considered in our paper), it is possible to com- 
bine the distribution resulting from the graph with the 
a-priori one, thus revealing some location information 
about the individuals. In addition, there is no privacy 
guarantee with respect to the other users participating 
in the service. The solutions we propose in this paper do 
not reveal to the SP any information about the distance 
between users, and let users define the privacy require- 
ment about the location information that buddies can 
acquire. 
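
The leak described above can be illustrated with a short Python sketch. It assumes, purely for illustration, that the distance preserving transformation is a secret rotation followed by a translation; the function `transform`, the secret parameters and the user coordinates are hypothetical and are not taken from the cited work. The point is only that any distance preserving transformation lets the SP rebuild the weighted graph of exact pairwise distances.

```python
import math


def transform(loc, angle, shift):
    """Apply a rotation followed by a translation (an isometry of the plane)."""
    x, y = loc
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + shift[0], s * x + c * y + shift[1])


# Hypothetical secret shared by the users and unknown to the SP.
secret_angle, secret_shift = 1.234, (5000.0, -7000.0)

# Real locations, never sent to the SP.
users = {"alice": (0.0, 0.0), "bob": (300.0, 400.0), "carol": (1000.0, 0.0)}

# What the SP receives: transformed coordinates only.
seen_by_sp = {u: transform(p, secret_angle, secret_shift)
              for u, p in users.items()}

# The SP can still assign to each pair of users their exact distance.
names = sorted(seen_by_sp)
graph = {
    (a, b): math.dist(seen_by_sp[a], seen_by_sp[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
```

The distances in `graph` coincide, up to floating-point error, with the distances between the real locations, although the SP never sees those locations.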

Zhong et al. propose three different techniques for 
privacy preservation in proximity-based services called 
Louis, Lester and Pierre [30]. These techniques are decentralized
secure computation protocols based on public-key
cryptography. Louis is a three-party secure computation
protocol. By running this protocol, a user A
gets to know whether another user B is in proximity 
without disclosing any other location information to B 
or to the third party T involved in the protocol. T only 
helps A and B compute their proximity, and it is as- 
sumed to follow the protocol and not to collude with A 
or B. However, T learns whether A and B are in prox- 
imity. Considering our adversary model, which will be 
explained in detail in Section 3.3, this third party can- 
not be the SP that may use proximity information to 
violate location privacy, and it is unlikely to be played 
by a third buddy since it would involve significant re- 
sources. The Lester protocol allows a user A to compute 
the exact distance from a user B only if the distance be- 
tween the two users is under a certain threshold chosen 
by B. The main advantage of these two techniques is 
that they protect a user's privacy without introducing 
any approximation in the computation of the proximity.
However, Louis incurs significant communication
overheads, and Lester high computational costs. In
addition, the only form of supported privacy protec- 
tion with respect to the buddies is the possibility for 
a user to refuse to participate in the protocol initiated 
by a buddy if she considers the requested proximity 
threshold too small. The Pierre protocol partitions the 
plane where the service is provided into a grid, with 
each cell having an edge equal to the requested distance
threshold. The locations of the users are then generalized
to the corresponding cell, and two users are considered
in proximity if they are located in the same cell
or in two adjacent cells. The achieved quality of service 
decreases as the requested proximity threshold grows. 
We will explain in more detail the actual impact on 
service precision in Section 7. Finally, it should be observed
that the Lester and Pierre protocols are based on
buddy-to-buddy communication, and although this can
guarantee total privacy with respect to the SP (as no 
SP is involved in the computation), scalability issues 
may arise since each time a user moves she needs to 
communicate her new position to each of her buddies. 
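
The grid-based test used by Pierre can be sketched in Python as follows. The sketch covers only the geometric part (no cryptography), and it assumes that "adjacent" includes diagonal neighbours; the function names and the sample coordinates are hypothetical. It shows why the answer is approximate: two users in adjacent cells are reported as being in proximity even when their distance exceeds the threshold.

```python
import math


def cell_of(loc, edge):
    """Index of the grid cell (with edge length `edge`) that contains `loc`."""
    return (math.floor(loc[0] / edge), math.floor(loc[1] / edge))


def grid_proximity(loc_a, loc_b, threshold):
    """Report proximity when the two cells coincide or are adjacent.

    Assumption: adjacency includes the diagonal neighbours of a cell.
    """
    (ax, ay) = cell_of(loc_a, threshold)
    (bx, by) = cell_of(loc_b, threshold)
    return abs(ax - bx) <= 1 and abs(ay - by) <= 1


threshold = 100.0
a = (10.0, 10.0)
close = (60.0, 10.0)      # distance 50: really in proximity
far = (190.0, 190.0)      # distance about 254.6, yet in an adjacent cell
very_far = (250.0, 10.0)  # two cells away

true_positive = grid_proximity(a, close, threshold)
false_positive = grid_proximity(a, far, threshold)
negative = grid_proximity(a, very_far, threshold)
```

Under this assumption the test never misses a buddy closer than the threshold, but the distance at which a false positive can occur scales with the cell edge, and therefore with the requested threshold.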

Another solution for privacy preserving computation
of proximity, called FriendLocator, has been proposed
by Siksnys et al. [26]. Similarly to Pierre, two
users are considered in proximity when they are located 
in the same cell or two adjacent cells of the grid con- 
structed considering the proximity threshold shared by 
the users. An interesting aspect of the proposed solu- 
tion is the location update strategy, which is designed to 
reduce the total number of location updates to be sent 
by the users, hence reducing communication costs. Two 
users share a hierarchy of grids, where a level identifies 
each grid. The larger the value of the level is, the finer 
the grid. The highest level grid is the one in which the 
edge of a cell is equal to the proximity threshold. The 
detection of proximity is then incremental, i.e., if two
users are in adjacent cells at the level n grid, then their 
respective cells in the grid of level n + 1 are checked, 
until they are detected either not to be in proximity, 
or to be in proximity considering the highest level grid. 
With this solution, when two users are detected not to 
be in proximity at a certain level l, there is no need for
them to check the proximity again until one of them
moves to a different cell of the level l grid. As a consequence,
fewer location updates are needed, and this is
experimentally shown to significantly reduce the total 
number of messages exchanged. However, the FriendLocator
protocol reveals to the SP some approximate information
about the distance between users (e.g., the level
in which the incremental proximity detection protocol 
terminates and whether the buddies are in proximity 
at that level). As already observed for the Louis proto- 
col, in our adversary model this information can lead 
to a privacy violation. Furthermore, the impact on the 
quality of service of using a large proximity threshold 
is identical to the Pierre protocol discussed above. 
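
The incremental detection over a hierarchy of grids can be sketched as follows. This is a simplified illustration and not the FriendLocator protocol itself: it works on plain coordinates, it assumes that the cell edge halves from one level to the next (the text only states that higher levels are finer and that the highest level has an edge equal to the threshold), and all names are hypothetical.

```python
import math


def cell_of(loc, edge):
    """Index of the grid cell (with edge length `edge`) that contains `loc`."""
    return (math.floor(loc[0] / edge), math.floor(loc[1] / edge))


def same_or_adjacent(c1, c2):
    """True when two cells coincide or touch (diagonals included, by assumption)."""
    return abs(c1[0] - c2[0]) <= 1 and abs(c1[1] - c2[1]) <= 1


def incremental_proximity(loc_a, loc_b, threshold, max_level):
    """Return (in_proximity, level at which the check terminated)."""
    for level in range(max_level + 1):
        # Assumed hierarchy: the edge halves at each level and equals the
        # proximity threshold at the highest level.
        edge = threshold * 2 ** (max_level - level)
        if not same_or_adjacent(cell_of(loc_a, edge), cell_of(loc_b, edge)):
            return False, level
    return True, max_level


near = incremental_proximity((10.0, 10.0), (60.0, 10.0), 100.0, 3)
medium = incremental_proximity((10.0, 10.0), (450.0, 10.0), 100.0, 3)
distant = incremental_proximity((10.0, 10.0), (5000.0, 10.0), 100.0, 3)
```

The second component of the result is the kind of information mentioned above: the level at which the check stops gives a coarse indication of how far apart the two users are.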

A more recent solution by the same authors [25], 
called VicinityLocator, solves this problem by letting
users specify their privacy preferences as spatial
granularities (see Section 3.2), independently of the
requested proximity threshold. A similar location update
strategy is employed to minimize the communication
costs. However, similarly to FriendLocator, the
SP learns some information about the distance among
the users, and this could lead to a privacy violation in
our adversary model.

In previous work [16,18], we proposed different protocols
for preserving privacy in proximity services. The
Longitude solution [16] translates the considered space
to a toroid, and a distance preserving transformation is 
applied to the locations of users. The SP participates in 
a form of three party secure computation of proximity, 
leading to an approximate but quite accurate service 
precision, guaranteeing privacy requirements with re- 
spect to buddies similar to the ones presented in this 
paper. Longitude also guarantees complete privacy with 
respect to the SP under the assumption that it has
no a-priori knowledge of the distribution of users, i.e.,
when a uniform distribution is assumed. In this paper
we also defend against SPs having arbitrary a-priori
distributions, showing that by running our protocols they
do not acquire any additional location information. The
Hide&Seek and Hide&Crypt protocols [18] are hybrid 
techniques in which the SP performs an initial compu- 
tation of the proximity. In some cases, the SP is not 
able to decide whether two users are in proximity, and 
a buddy-to-buddy protocol is triggered. An important 
difference with respect to the protocols we are present- 
ing here is that the SP is not totally untrusted: users 
can specify a level of location precision to be released to 
the SP and a different one for buddies. This hybrid approach
significantly reduces communication costs with
respect to decentralized solutions when privacy require- 
ments with respect to the SP are not too strict. 



3 Problem formalization 

In this section we formally define the service we are con- 
sidering, the users' privacy concerns and requirements, 
the adversary model, and the occurrence of a privacy 
violation. 



3.1 The proximity service 

By issuing a proximity request, user A is interested in
knowing, for each of her buddies B, whether the following
condition is satisfied:



$$d(loc_A, loc_B) < \delta_A \qquad (1)$$

where $d(loc_A, loc_B)$ denotes the Euclidean distance between
the reported locations of A and B, and $\delta_A$ is a
threshold value given by A. When (1) is true, we say
that B is in the proximity of A. The proximity relation
is not symmetric, since $\delta_B$ may be different from $\delta_A$.
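
As a minimal illustration of condition (1), the following Python sketch evaluates the proximity predicate on planar coordinates. The function name `in_proximity` and the sample coordinates and thresholds are hypothetical; they only show why the relation is not symmetric when the two thresholds differ.

```python
import math


def in_proximity(loc_a, loc_b, delta_a):
    """True when B is in the proximity of A, i.e. d(loc_A, loc_B) < delta_A."""
    return math.dist(loc_a, loc_b) < delta_a


# Hypothetical planar coordinates (in metres) and thresholds.
loc_alice, loc_bob = (0.0, 0.0), (300.0, 400.0)  # Euclidean distance: 500
delta_alice, delta_bob = 600.0, 200.0

bob_near_alice = in_proximity(loc_alice, loc_bob, delta_alice)
alice_near_bob = in_proximity(loc_bob, loc_alice, delta_bob)
```

With these values Bob is in the proximity of Alice, while Alice is not in the proximity of Bob.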



In this paper we consider services in which the buddies
of a user are pre-determined. We call these services
"contact-list-based", since buddies are explicitly added
as "friends", like in most social networks and instant
messaging applications. This is in contrast to "query- 
driven" proximity services, in which buddies can be re- 
trieved through a query based, for example, on the in- 
terests of the buddies. Technically, the main difference 
is that in the "contact-list-based" service it is reason- 
able to assume that each user can share a secret with 
each of her buddies, as we do in our proposed tech- 
niques. On the contrary, in the case of "query-driven" 
services, the set of buddies may change dynamically, 
and the number of buddies can be potentially very large. 
In this situation, it may not be practical to share a se- 
cret with each buddy. 

In the presence of a service provider (SP), and in the
absence of privacy concerns, a simple protocol can be
devised to implement the proximity service: the SP receives
location updates from each user and stores their
last known positions, as well as the distance threshold
$\delta_A$ for each user A. While in theory each user can define
different threshold values for different buddies, in this
paper, for simplicity, we consider the case in which each
user A defines a single value $\delta_A$ for detecting the proximity
of all of her buddies. When the SP receives a location
update from A, it can recompute the distance between A
and each buddy (possibly with some filtering/indexing
strategy for efficiency) and communicate the result to
A. In a typical scenario, if B is in proximity, A may
contact him directly or through the SP; however, for
the purpose of this paper, we do not concern ourselves
with what A will do once notified. In the rest of this
paper we refer to the above protocol as the Naive protocol.
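
The Naive protocol can be sketched in a few lines of Python. The class `NaiveSP`, its methods and the sample data are hypothetical and serve only as an illustration; no filtering or indexing strategy is included. Note that the SP stores the exact location of every user, which is precisely the privacy problem discussed in the next section.

```python
import math


class NaiveSP:
    """Toy service provider for the Naive protocol (no privacy protection)."""

    def __init__(self):
        self.locations = {}   # user -> last known exact location (x, y)
        self.thresholds = {}  # user -> distance threshold delta
        self.buddies = {}     # user -> set of buddies

    def register(self, user, delta, buddies):
        self.thresholds[user] = delta
        self.buddies[user] = set(buddies)

    def location_update(self, user, loc):
        """Store the exact location and return the buddies in proximity."""
        self.locations[user] = loc
        near = set()
        for buddy in self.buddies[user]:
            buddy_loc = self.locations.get(buddy)
            if (buddy_loc is not None
                    and math.dist(loc, buddy_loc) < self.thresholds[user]):
                near.add(buddy)
        return near


sp = NaiveSP()
sp.register("alice", delta=500.0, buddies=["bob", "carol"])
sp.register("bob", delta=100.0, buddies=["alice"])
sp.register("carol", delta=100.0, buddies=["alice"])
sp.location_update("bob", (300.0, 0.0))
sp.location_update("carol", (900.0, 0.0))
result = sp.location_update("alice", (0.0, 0.0))
```

Here `result` contains only Bob, who is 300 units away from Alice; Carol, at 900 units, is outside Alice's threshold.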



3.2 Privacy concerns and privacy requirements 

The privacy we are considering in this paper is location 
privacy: we assume that a user is concerned about the 
uncontrolled disclosure of her location information at 
specific times. 

Considering the Naive protocol, it is easily seen that 
the SP obtains the exact location of a user each time she 
issues a location update. Furthermore, a user's location 
information is also disclosed to her buddies. If Alice is 
in the proximity of Bob (one of her buddies), then Bob 
discovers that Alice is located in the circle centered at
his location with radius $\delta_{Bob}$. Since $\delta_{Bob}$ is chosen by
Bob and can be set arbitrarily without consent from
Alice, Alice has no control over the location information
disclosed to Bob.



Our definition of location privacy is based on the
idea that the users should be able to control the location 
information to be disclosed. In the considered services, 
a user may prefer the service provider to have as lit- 
tle information about her location as possible, and the 
buddies not to know her exact position, even when the 
proximity is known to them. Moreover, the exchanged 
information should be protected from any eavesdrop- 
per. 

In general, the level of location privacy can be rep- 
resented by the uncertainty that an external entity has 
about the position of the user. This uncertainty is a 
geographic region, called minimal uncertainty region 
(MUR), and its intuitive semantics is the following: the
user accepts that the adversary knows she is located 
in a MUR R, but no information should be disclosed 
about her position within R. 

In the solution proposed in this paper, each user can 
express her privacy preferences by specifying a parti- 
tion of the geographical space defining the MURs that 
she wants guaranteed. For example, Alice specifies that 
her buddies should never be able to find out the spe- 
cific campus building where Alice currently is; in this 
case, the entire campus area is the minimal uncertainty 
region. The totality of these uncertainty regions for a 
user can be formally captured with the notion of spatial 
granularity. 

While there does not exist a formal definition of
spatial granularity that is widely accepted by the research
community, the idea behind this concept is simple.
Similar to a temporal granularity [7], a spatial granularity
can be considered a subdivision of the spatial
domain into a discrete number of non-overlapping regions,
called granules. In this paper, for simplicity, we
consider only granularities* that partition the spatial
domain, i.e., the granules of a granularity do not intersect
and the union of all the granules in a granularity
yields exactly the whole spatial domain. Each granule
of a granularity G is identified by an index (or a label).
We denote with G(i) the granule of the granularity G
with index i.

* Here and in the following, when no confusion arises, we use
the term "granularity" to mean "spatial granularity".

Users specify their privacy requirements via spatial
granularities, with each granule being a MUR. The two
extreme cases in which a user requires no privacy protection
and maximum privacy protection, respectively,
can be naturally modeled. In one extreme case, if a
user A does not want her privacy to be protected, then
A sets her privacy preference to the bottom granularity
$\bot$ (a granularity that contains a granule for each basic
element, or pixel, of the spatial domain). In the other
extreme, if user A wants complete location privacy, then
she sets her privacy preference to the top granularity $\top$,
i.e., the granularity that has a single granule covering
the entire spatial domain. In this case, A wants the entire
spatial domain as MUR.
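To make the notion of granularity concrete, the following is a minimal sketch of a granularity that partitions a rectangular pixel domain into square cells. The class name `GridGranularity`, the row-major indexing scheme, and the helpers `bottom` and `top` are assumptions introduced for illustration only; the paper does not prescribe any particular shape for granules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridGranularity:
    """Hypothetical granularity: square cells partitioning a width x height pixel domain."""
    width: int
    height: int
    cell: int  # side of each granule, in pixels

    def cols(self) -> int:
        # Number of granules per row (ceiling division).
        return -(-self.width // self.cell)

    def index_of(self, x: int, y: int) -> int:
        """Index i of the granule G(i) that contains pixel (x, y)."""
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise ValueError("pixel outside the spatial domain")
        return (y // self.cell) * self.cols() + (x // self.cell)

    def granule(self, i: int) -> tuple:
        """Bounding box (x_min, y_min, x_max, y_max) of G(i); max bounds are exclusive."""
        row, col = divmod(i, self.cols())
        x0, y0 = col * self.cell, row * self.cell
        return (x0, y0, min(x0 + self.cell, self.width), min(y0 + self.cell, self.height))


def bottom(width: int, height: int) -> GridGranularity:
    """Bottom granularity: one granule per pixel (no privacy protection)."""
    return GridGranularity(width, height, 1)


def top(width: int, height: int) -> GridGranularity:
    """Top granularity: a single granule covering the whole domain."""
    return GridGranularity(width, height, max(width, height))
```

With a 100 x 100 domain and 25-pixel cells, every pixel of a cell maps to the same index, so reporting the index alone reveals the MUR but nothing about the position within it.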

In the rest of this paper, we assume that each
user A specifies a granularity $G_A$ defining her location
privacy requirements with respect to all buddies. Our
approach can be easily extended to model the case in
which a user specifies a different granularity for each
buddy or for each group of buddies, as discussed
in Section 8. We also assume that each user's
privacy requirement with respect to the SP is the entire
spatial domain, i.e., the user does not want to disclose
any location information to the SP.



3.3 Adversary model and privacy preservation 

We consider two adversary models, for the SP and the
buddies, respectively. Treating the SP and the buddies
as potential adversaries also models other types of
adversaries. Firstly, it models the case of an external
entity taking control of the SP system or of a buddy's
system. Secondly, it models the case of an external entity
eavesdropping on one or more communication channels
between users and the SP. Note that, in the worst
case, the eavesdropper can observe all the messages that
are exchanged in the protocol. Since the same holds for
the SP, the eavesdropper can learn at most what the SP
learns. Since in this paper we prove that the SP does
not acquire any location information, the same
holds for an eavesdropping adversary.

The techniques we present in this paper not only
guarantee each user's privacy requirement against these
two adversary models, but also in the case of a set of
colluding buddies. In Section 5.1.2 we also discuss which
privacy guarantees are provided by our techniques in
case one or more buddies collude with the SP.

In both adversary models we assume that the ad- 
versary knows: 

— the protocol, 

— the spatial granularities adopted by each user, and 

— an a-priori probability distribution of the locations
of the users.

The two models differ in the sets of messages received 
during a protocol run, and in their ability (defined by 
the protocol in terms of availability of cryptographic 
keys) to decrypt the content of the messages. 

The a-priori knowledge of the location of a user A
is given by a location random variable $\mathit{pri}_A$ with the
probability mass distribution denoted $P(\mathit{pri}_A)$. In other
words, as prior knowledge we assume that the location
of a user A follows a known distribution given by the
distribution of the random variable $\mathit{pri}_A$. Note that in
this paper we assume the spatial domain is discrete, i.e.,
a countable set of "pixels".

Let M be the set of messages exchanged between
the entities involved in the service. The adversary can
compute the a-posteriori probability distribution of the
location random variable $\mathit{post}_A$ as the distribution of
the location of A given the messages M and the
prior knowledge $\mathit{pri}_A$:

$$P(\mathit{post}_A) = P(\mathit{loc}_A \mid M, \mathit{pri}_A)$$

Technically, we may view $\mathit{loc}_A$ as a uniform random
variable over the spatial domain, i.e., the possible location
of A when no knowledge is available.

The condition for privacy preservation is formally
captured by Definition 1.

Definition 1 Given a user A with privacy requirement
$G_A$, and M the set of messages exchanged by the proximity
service protocol in which A is participating, A's
privacy requirement is said to be satisfied if

$$P(\mathit{loc}_A \mid M, \mathit{pri}_A, \mathit{loc}_A \in g_A) = P(\mathit{loc}_A \mid \mathit{pri}_A, \mathit{loc}_A \in g_A)$$

for all a-priori knowledge $\mathit{pri}_A$ and all granules $g_A$ of
$G_A$.

The above definition requires that the location distribution
of user A does not change due to the messages
M, given the a-priori knowledge and the fact that A is
located in $g_A$. Hence, a privacy violation occurs when
the adversary acquires, through the analysis of the protocol
messages, more information about the location of
A than allowed by her privacy requirements, i.e., when
the probability distribution of the position of A within
the region defined by granule $g_A$ changes with respect
to $\mathit{pri}_A$.

Since we aim at complete location privacy with respect
to the SP, we take $g_A$ to be the entire spatial domain
in the above definition when the SP is concerned.
In this case, the definition requires
$P(\mathit{loc}_A \mid M, \mathit{pri}_A) = P(\mathit{loc}_A \mid \mathit{pri}_A)$,
i.e., $P(\mathit{post}_A) = P(\mathit{pri}_A)$, meaning that the SP acquires
no new location information about any user A. In this case, we also
say that A's privacy requirement is satisfied with respect
to the SP. For the buddies, user A uses a granularity
$G_A$, which may not be $\top$. In this case, the definition
requires that, even with the additional knowledge of A being
in a granule, the buddies cannot derive anything more
(e.g., where within the granule) from the messages exchanged.
In this case, we also say that A's privacy requirement
is satisfied with respect to the buddies.
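As a small numeric illustration of Definition 1, the sketch below checks the condition on a toy discrete domain using exact fractions. The function names, the four-pixel domain, and the way a message is modeled (as a likelihood P(M | loc) for each pixel) are assumptions made for this example; they are not part of the protocols.

```python
from fractions import Fraction


def conditional_within(dist, granule):
    """Distribution of the location conditioned on lying inside `granule`."""
    mass = sum(dist[p] for p in granule)
    return {p: dist[p] / mass for p in granule}


def posterior(prior, likelihood):
    """Bayes rule: P(loc | M, pri) from the prior and the likelihood P(M | loc)."""
    joint = {p: prior[p] * likelihood[p] for p in prior}
    total = sum(joint.values())
    return {p: joint[p] / total for p in prior}


def satisfies_definition(prior, likelihood, granules):
    """True if, inside every granule still possible after M, the posterior
    restricted to the granule equals the prior restricted to the granule."""
    post = posterior(prior, likelihood)
    for g in granules:
        if sum(post[p] for p in g) == 0:
            continue  # granule excluded by M: the conditional is undefined
        if conditional_within(post, g) != conditional_within(prior, g):
            return False
    return True


# Toy domain: pixels 0..3, two granules, a non-uniform prior.
PRIOR = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
GRANULES = [(0, 1), (2, 3)]

# A message revealing only "A is in granule (0, 1)" keeps the within-granule distribution.
GRANULE_ONLY = {0: 1, 1: 1, 2: 0, 3: 0}
# A message revealing "A is at pixel 0" changes the distribution inside the granule.
EXACT_PIXEL = {0: 1, 1: 0, 2: 0, 3: 0}
```

In this toy setting, `satisfies_definition(PRIOR, GRANULE_ONLY, GRANULES)` holds while the check fails for `EXACT_PIXEL`, which matches the intuition that disclosing the granule is allowed but disclosing the position inside it is a violation.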

4 Defense techniques 

In this section we present two protocols to preserve
location privacy in proximity-based services. The protocols
are called C-Hide&Seek and C-Hide&Hash and
they guarantee privacy protection of a user A with respect
to both the SP and the buddies of A.

In order to ensure users' privacy, the two protocols
adopt symmetric encryption techniques. In the following,
we assume that each user A has a key $K_A$ that is
shared with all of her buddies and is kept secret from
everybody else. Hence, each user A knows her own key $K_A$
and one key $K_B$ for each buddy B. Since we are considering
a contact-list-based service, this key exchange
is assumed to be performed with any secure method
before running our protocols.

For the sake of presentation, we decompose each
protocol into two parts: the location update sub-protocol
is used by a user to provide her location information,
while the proximity request sub-protocol is used by a
user to compute the proximity of her buddies. The location
update sub-protocol is almost the same in both
of our proposed solutions, and it is presented in Section 4.1.
What really distinguishes C-Hide&Seek and C-Hide&Hash
is the proximity request sub-protocol, and
this is described in Sections 4.2 and 4.3, respectively.
We conclude this section with a discussion about possible
technical extensions.



4.1 The location update sub-protocol 

The location update sub-protocol is run by a user to
provide location information to the SP. In particular, it
defines how a user A provides to the SP the encrypted
index of the granule of $G_A$ where she is located.

Before describing the sub-protocol, we first discuss
when it should be run. Consider the following naive policy:
a user A updates her location only when she crosses
the boundary between two granules of $G_A$, reporting
the index of the new granule. It is easily seen that,
independently of how the location update is performed,
each time this message is received, the adversary learns
that A is very close to the border between two granules,
excluding many other locations, and hence violating
the privacy requirements. Intuitively, the problem
of the above policy is that the probability that a location
update is performed at a given time depends on
the location from where the message is sent.

The solution we propose is the following: time is
partitioned into update intervals and an approximate
synchronization on these intervals among the participating
nodes is assumed.* Each update interval has the
same duration T and is identified by an index. Each
user has a value t in [0, T) and sends exactly one location
update during each update interval, after
time t has elapsed from the beginning of the interval (see
Figure 1). It is easily seen that, by using this update
policy, the location updates are issued independently
of the location of the users.

* In our current implementation, all the messages sent from the
SP to the users contain the timestamp of the SP, allowing clients
to synchronize their clocks using a Lamport-style algorithm. The
overhead due to this solution is negligible. Other forms of global
clock synchronization could also be used, e.g., using GPS devices.

Fig. 1 Location update policy and generation of single-use keys.

We now describe how the location update sub-protocol
works. User A first computes the index i of the granule
of $G_A$ where she is located. Then, A encrypts i
using a slightly different technique in the two proposed
solutions. In the C-Hide&Seek protocol a symmetric
encryption function E is applied, while in the C-Hide&Hash
protocol a hashing function H is used. When
applying the hashing function H, in order to prevent
brute-force attacks, a secret key is used as a "salt", i.e.,
a secret key is concatenated to i, and the resulting value
is given as input to H. In the following, we refer to this
salt as the "key" used to hash i, and we denote with
$H_K(i)$ the hashing of the value i with key K.

The safety of the protocols depends on the fact that
the key used to encrypt or hash i is changed at every
use. At the same time, we need the key to be shared by
a user with all of her buddies. While other techniques
can be adopted to achieve this result, our solution is
the following: the key $K_A$ that A shares with all of her
buddies is used to initialize a keystream. When user A
issues a location update, she computes the key $K^{ui}$ as
the ui-th value of this keystream, where ui is the index
of the current update interval (see Figure 1). Since
each user issues a single location update during each
time interval, this solution ensures that every message
is encrypted or hashed with a different key. Finally, A
sends to the SP the message $\langle A, ui, E_{K^{ui}}(i) \rangle$ if running
C-Hide&Seek, and $\langle A, ui, H_{K^{ui}}(i) \rangle$ if running C-Hide&Hash.
The SP stores this information as the last known
encrypted location for A. Figure 2 shows the message
sent from A to the SP by the C-Hide&Seek protocol.

Fig. 2 Location update sub-protocol in C-Hide&Seek.
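The location update sub-protocol described in this section can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the keystream is modeled as HMAC-SHA256 of the update interval index under the shared key, the hash H is SHA-256 over the key concatenated with the index, and the encryption E is replaced by a toy XOR with the single-use key. All function names are hypothetical.

```python
import hashlib
import hmac


def interval_index(now: float, T: float) -> int:
    """Index ui of the update interval that contains time `now`."""
    return int(now // T)


def next_update_time(now: float, T: float, t: float) -> float:
    """Next instant of the form k*T + t that is not earlier than `now`,
    where t in [0, T) is the fixed offset chosen by the user."""
    candidate = int(now // T) * T + t
    return candidate if candidate >= now else candidate + T


def keystream_key(shared_key: bytes, ui: int) -> bytes:
    """ui-th keystream value derived from the shared key (assumed construction)."""
    return hmac.new(shared_key, ui.to_bytes(8, "big"), hashlib.sha256).digest()


def hash_index(key: bytes, i: int) -> bytes:
    """H_K(i): the key is concatenated to the index as a salt before hashing."""
    return hashlib.sha256(key + i.to_bytes(8, "big")).digest()


def encrypt_index(key: bytes, i: int) -> bytes:
    """Toy stand-in for E_K(i): XOR of the 8-byte index with the single-use key."""
    return bytes(a ^ b for a, b in zip(i.to_bytes(8, "big"), key))


def decrypt_index(key: bytes, ciphertext: bytes) -> int:
    """Inverse of encrypt_index, run by a buddy who knows the shared key."""
    return int.from_bytes(bytes(a ^ b for a, b in zip(ciphertext, key)), "big")


def location_update(user: str, shared_key: bytes, ui: int, i: int, protocol: str):
    """Message sent to the SP: (user, ui, E(i)) for "seek", (user, ui, H(i)) for "hash"."""
    k = keystream_key(shared_key, ui)
    payload = encrypt_index(k, i) if protocol == "seek" else hash_index(k, i)
    return (user, ui, payload)
```

Because the sending time depends only on T and the offset t, and the key depends only on the shared key and ui, neither the timing nor the key selection carries information about the granule index i.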



4.2 Proximity request with C-Hide&Seek 

The proximity request sub-protocol is run by a user that
wants to discover which of her buddies are in proximity.
In the C-Hide&Seek protocol, this sub-protocol works
as follows: When A wants to discover which buddies
are in proximity, she sends a request to the SP. The SP
replies with a message containing the last known encrypted
location of each buddy of A. That is, for each
buddy B, A receives a tuple $\langle B, ui, E_{K^{ui}}(i) \rangle$. Since A
knows $K_B$ and the index ui is in the message, she can
compute the value $K^{ui}$ used by B to encrypt his location,
and hence she can decrypt $E_{K^{ui}}(i)$. Finally, since
A also knows $G_B$, by using i, she obtains the granule
$g_B = G_B(i)$ where B is located. A can then compute
the distance between her exact location and $g_B$ and
compare it with $\delta_A$, finally determining the proximity.
Figure 3 shows a graphical representation of the sub-protocol.



Fig. 3 Proximity request sub-protocol in C-Hide&Seek. (A sends a proximity request to the SP; the SP replies with $\langle B, ui, E_{K^{ui}}(i) \rangle$ for each buddy B.)



Note that we are now considering the proximity between
a point and a region. In this section, we consider
that a point and a region are in proximity, with respect
to a distance threshold, if the minimum distance between
the two objects is less than the threshold. Since,
in our protocol, the region represents the area where a
user B is possibly located, this interpretation of proximity
means that there is a possibility for users A and B to
actually be in proximity. The same minimum distance
interpretation has been used in related work on privacy-aware
proximity computation. Alternative interpretations
and their effects are discussed in Section 5.2.
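The minimum distance interpretation can be sketched as follows for the simple case in which a granule is an axis-aligned rectangle. The rectangle representation `(x_min, y_min, x_max, y_max)` and the function names are assumptions for illustration; granules of arbitrary shape would need a different distance routine.

```python
import math


def min_dist(point, rect):
    """Minimum Euclidean distance between a point and an axis-aligned
    rectangle (x_min, y_min, x_max, y_max); zero if the point is inside."""
    x, y = point
    x0, y0, x1, y1 = rect
    dx = max(x0 - x, 0.0, x - x1)  # horizontal gap, 0 if x is within [x0, x1]
    dy = max(y0 - y, 0.0, y - y1)  # vertical gap, 0 if y is within [y0, y1]
    return math.hypot(dx, dy)


def in_proximity(point, granule, delta):
    """Proximity between a point and a region: the minimum distance
    is less than the threshold delta."""
    return min_dist(point, granule) < delta
```

For example, a user at the origin and a granule spanning (3, 4) to (5, 6) are at minimum distance 5, so they are in proximity for a threshold of 5.1 but not for a threshold of 5.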

The C-Hide&Seek protocol provides a simple and
efficient solution that, as will be shown in Section 5,
completely hides the location of the users from the SP,
and that also guarantees the privacy requirements with
respect to the buddies. However, it reveals exactly the
maximum tolerable amount of location information ($g_B$
for user B) to any buddy issuing a proximity request.
Even if their privacy requirements are guaranteed, users
would probably prefer to disclose as little information as
possible about their location when not strictly needed.
For example, is there an alternative solution that does
not reveal to a user A the granule information of a
buddy B if he is not in proximity?

In the next section we present the C-Hide&Hash
protocol, which provides such a solution and, in general,
ensures a higher level of privacy. This is achieved at the
cost of higher computation and communication costs,
as explained in Section 5.4.



4.3 Proximity request in C-Hide&Hash 

The C-Hide&Hash protocol has two main differences
with respect to C-Hide&Seek. The first difference is
that a hash function H is used during the location update,
instead of the encryption function. This is due to
the requirement in this protocol to avoid revealing the
relationship between two plaintext values (the granule
indexes) by observing the relationship among the corresponding
encrypted values (a more detailed explanation
is given later in the paper). Since in this protocol we do not
need to decrypt the result of the function, but we only
need to check for equality of encrypted values, hashing
can be used. As specified in Section 4.1, each location
update in C-Hide&Hash from user A to the SP is a
message containing the tuple $\langle A, ui, H_{K^{ui}}(i) \rangle$.

Fig. 4 Computation of granules of $G_B$ considered in proximity by A.



The second and main difference with respect to C-Hide&Seek
is the computation of the proximity request
sub-protocol. The intuition is that when A issues a
proximity request, she computes, for each of her buddies
B, the set of indexes of granules of $G_B$ such that, if
B is located in any granule of the set, then B is in proximity
(see Figure 4). Then, if B provides the granule in
which he is located, it is possible to reduce the proximity
problem to the set-inclusion problem, by checking
if that granule is included in the set computed by A.
We want to do this set inclusion without revealing to A
which of the candidate granules actually matched the
granule of B.

More precisely, the computation of a proximity request
in the C-Hide&Hash protocol works as follows.
When a user A issues a proximity request, she starts
a two-party set inclusion protocol with the SP. The
protocol is a secure computation, and consequently the
SP does not learn whether A is in proximity with her
buddies, and A only learns, for each of her buddies B,
whether B is in proximity or not, without learning in
which granule B is located. The secure computation
exploits a commutative encryption function C. In addition
to the keys used in the C-Hide&Seek protocol, at
each proximity request, the requesting user and the SP
each generate a random key that is not shared with
anyone else. We denote these keys $K_1$ for user A and
$K_2$ for the SP.

The proximity request sub-protocol is divided into
three steps, whose pseudo-code is illustrated in Protocol 1.
In Step (i), user A computes, for each buddy B,
the set S' of indexes of granules of $G_B$ such that, if B is
located in one of these granules, then B is in proximity.
More formally, A computes the set of indexes i such
that the minimum distance minDist between the location
of A and $G_B(i)$ is less than or equal to $\delta_A$. Then,
in order to hide the cardinality of S', A creates a new
set S by adding to S' some non-valid randomly chosen
indexes (e.g., negative numbers). This is done to increase
the cardinality of S without affecting the result
of the computation. The cardinality of S is increased
so that it is as large as the number $sMax(G_B, \delta_A)$
that represents the maximum number of granules of $G_B$
that intersect with any circle with radius $\delta_A$. Note that
$sMax(G_B, \delta_A)$ can be computed off-line since its value
depends only on $G_B$ and $\delta_A$. In the following, when no
confusion arises, we use sMax as a short notation for
$sMax(G_B, \delta_A)$.
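The construction of the sets S' and S in Step (i) can be sketched as follows. The sketch assumes rectangular granules given as a dictionary from index to `(x_min, y_min, x_max, y_max)`, takes sMax as a precomputed parameter, and uses random negative integers as the non-valid indexes; the function names are hypothetical.

```python
import math
import secrets


def min_dist(point, rect):
    """Minimum distance between a point and an axis-aligned rectangle."""
    x, y = point
    x0, y0, x1, y1 = rect
    return math.hypot(max(x0 - x, 0.0, x - x1), max(y0 - y, 0.0, y - y1))


def candidate_set(loc, granules, delta):
    """S': indexes j of the buddy's granularity with minDist(loc, G_B(j)) <= delta."""
    return {j for j, rect in granules.items() if min_dist(loc, rect) <= delta}


def padded_set(s_prime, s_max):
    """S = S' united with S'', where S'' holds s_max - |S'| distinct
    negative (hence non-valid) random indexes, so that |S| == s_max."""
    if len(s_prime) > s_max:
        raise ValueError("s_max must bound the size of any candidate set")
    padding = set()
    while len(padding) < s_max - len(s_prime):
        padding.add(-1 - secrets.randbelow(2**32))
    return set(s_prime) | padding
```

Since every request carries exactly sMax elements, the size of the set sent to the SP does not depend on where A is located; the padding never matches a real granule index because valid indexes are non-negative.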

In Line 8, each element of S is first hashed using the
key $K^{ui}$, which is obtained as the ui-th value generated
by the keystream initialized with $K_B$. In this case ui is
the index of the update interval preceding the current
one. Then, the result is encrypted, using the commutative
encryption function C and the randomly generated key $K_1$.
The element composed of the set ES computed in Line 8,
B, and ui is then added to the set proxReq.

Once the operations in Lines 4 to 9 are executed for
each buddy B, the set proxReq is sent to the SP.

Protocol 1 C-Hide&Hash: proximity request

Input: User A knows the last completed update interval and the
proximity threshold δ_A. Also, for each of her buddies B, A knows
the granularity G_B, the key K_B, and the value of sMax(G_B, δ_A).

Protocol:

(i) Client request from A
 1: proxReq = ∅
 2: generate a random key K_1
 3: for each buddy B of A do
 4:   S' = {j ∈ N s.t. minDist(loc_A, G_B(j)) ≤ δ_A}
 5:   S'' = a set of sMax(G_B, δ_A) − |S'| non-valid random indexes
 6:   S = S' ∪ S''
 7:   K^ui is the ui-th value of the keystream initialized with K_B
 8:   ES = ∪_{i ∈ S} C_{K_1}(H_{K^ui}(i))
 9:   insert ⟨B, ui, ES⟩ in proxReq
10: end for
11: A sends proxReq to the SP

(ii) SP response
 1: proxResp = ∅
 2: generate a random key K_2
 3: for each ⟨B, ui, ES⟩ in proxReq do
 4:   ES' = ∪_{e ∈ ES} C_{K_2}(e)
 5:   retrieve ⟨B, ui, h_B⟩ updated by B at update interval ui
 6:   h' = C_{K_2}(h_B)
 7:   insert ⟨B, ES', h'⟩ in proxResp
 8: end for
 9: SP sends proxResp to A

(iii) Client result computation
 1: for each ⟨B, ES', h'⟩ in proxResp do
 2:   h'' = C_{K_1}(h')
 3:   if h'' ∈ ES' then
 4:     A returns "B is in proximity"
 5:   else
 6:     A returns "B is not in proximity"
 7:   end if
 8: end for

Upon receiving proxReq, the SP starts Step (ii). For
each tuple $\langle B, ui, ES \rangle$ in proxReq, the SP encrypts with
the C function each element of ES using the randomly
generated key $K_2$. The result is the set ES'. Then,
it retrieves the tuple $\langle B, ui, h_B \rangle$ updated by B at the
update interval ui. In this tuple, $h_B$ is the value of the
index of the granule of $G_B$ where B is located, hashed
with the key $K^{ui}$. Since ui is the update interval preceding
the current one, our location update policy assures
that a location update with update interval ui has already
been issued by every buddy B. Finally, the SP
encrypts $h_B$ with the commutative encryption function
C using key $K_2$. The resulting value h' is added, together
with B and ES', to the set proxResp.

Once the computations at Lines 4 to 7 are executed
for each buddy B, the set proxResp is sent to A.

In Step (iii), given the message proxResp received
from the SP, A computes the proximity of her buddies.
For each tuple $\langle B, ES', h' \rangle$, A obtains h'' as the
encryption of h' with C and the key $K_1$ and checks if
the result is in ES'. If this is the case, then B is in
proximity, otherwise he is not.

More formally, $h'' \in ES'$ if and only if the granule of
$G_B$ with index i containing B is in S', which is equivalent
to B being in proximity. Indeed, for each buddy B, we
recall that:

$$h'' = C_{K_1}(C_{K_2}(h_B))$$

and

$$ES' = \bigcup_{i \in S} C_{K_2}(C_{K_1}(H_{K^{ui}}(i)))$$

Consequently, due to the commutative property of the
encryption function, $h'' \in ES'$ if and only if

$$h_B \in \bigcup_{i \in S} H_{K^{ui}}(i)$$

Since $h_B$ and the elements of the set are hashed using
the same key $K^{ui}$, $h_B$ is in the set if and only if $i \in S$.
Since $S = S' \cup S''$ and $i \notin S''$ (because S'' contains
invalid integers only while i is a valid integer), then $i \in S$
if and only if $i \in S'$. By definition of S', this implies
that B is in proximity.

Figure 5 shows the messages exchanged during the
proximity request sub-protocol of C-Hide&Hash.

Fig. 5 Proximity request sub-protocol in C-Hide&Hash. (A sends proxReq, containing $\langle B, ui, ES \rangle$ for each buddy B, to the SP; the SP replies with proxResp, containing $\langle B, ES', C_{K_2}(h_B) \rangle$ for each buddy B.)
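The three steps of the proximity request can be sketched end to end for a single buddy as follows. The text only requires C to be a commutative encryption function; this sketch assumes modular exponentiation over a fixed prime as one possible choice, with a toy-sized modulus that is not meant as a secure parameter. The hash H is modeled as SHA-256 over the key concatenated with the index, and all function names are hypothetical.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # a Mersenne prime used as a toy modulus (assumption, not a secure size)


def random_key() -> int:
    """Random exponent coprime with P - 1, so that C is injective on [1, P - 1]."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def C(key: int, x: int) -> int:
    """Commutative encryption: C(k1, C(k2, x)) == C(k2, C(k1, x))."""
    return pow(x, key, P)


def H(key: bytes, i: int) -> int:
    """Salted hash of a (possibly negative, non-valid) index, mapped into [1, P - 1]."""
    digest = hashlib.sha256(key + i.to_bytes(8, "big", signed=True)).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def client_request(S, k_ui: bytes, k1: int):
    """Step (i), Line 8: ES is the set of C_{K1}(H_{K^ui}(i)) for i in S."""
    return {C(k1, H(k_ui, i)) for i in S}


def sp_response(ES, h_B: int, k2: int):
    """Step (ii): the SP re-encrypts ES and the stored hash h_B with its own key K2."""
    return {C(k2, e) for e in ES}, C(k2, h_B)


def client_result(ES_prime, h_prime: int, k1: int) -> bool:
    """Step (iii): B is in proximity iff C_{K1}(h') belongs to ES'."""
    return C(k1, h_prime) in ES_prime
```

In this sketch the SP only handles values that are encrypted under a key it does not know, and A only learns whether the doubly encrypted hash appears in the doubly encrypted set, not which element matched.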



4.4 Contrasting velocity attacks and other background 
knowledge 

It is easily seen that our location update policy, based
on fixed length update intervals, makes the probability
that a location update is issued independent of
the location from where it is issued. This is an important
property used in Section 5, together with others,
to prove the safety of our solutions under the adversary
models we consider.

Clearly, if the adversary had arbitrary background
knowledge, there would not be any technique that could
guarantee privacy. However, it is interesting to consider
some other forms of knowledge that the adversary
could use. Unlike previous proposals, our defenses
are resistant to an important type of background
knowledge: the a-priori distribution of the users' locations.
There are, however, other types of knowledge that may
be interesting to consider, for example time-dependent
a-priori location knowledge. This includes
knowledge of the relative position of users at a certain
time, as well as the a-priori probability of user movements.
With this kind of knowledge it is also possible to perform
attacks based on the velocity of users. Consider
Example 1.

Example 1 User A sends two location updates in two
consecutive update intervals i and j from granules $g_1$ and
$g_2$, respectively. Her buddy B issues a proximity request
in each update interval and discovers the granule where
A is located. So far, no privacy violation has occurred for
A. However, if B knows that A moves at most with
velocity v, then he can exclude that A is located in
some locations l of $g_2$. Indeed, B knows that the temporal
distance between the two location updates of A is
equal to the length T of the update period. Now B can
exclude that A is located in any location l of $g_2$ such
that the time required to move from any point of $g_1$ to
l with velocity v is larger than T. Hence B violates the
privacy requirement of A.

The problem in Example 1 arises when the adversary
knows the maximum velocity of a user. Velocity-based
attacks have been recently considered independently
of proximity services [5], but the application
of those solutions in our framework would lead to the release
of some location information to the SP. In the following
we show how to adapt our location update policy
to provide protection preserving our privacy properties
in the specific case in which the adversary knows the
maximum velocity v of a user.

Let $tMax(g_1, g_2)$ be the maximum time required to
move at velocity v from each point of granule $g_1$ to each
point of granule $g_2$. The problem of Example 1 arises
when the temporal distance between two location updates
issued from two different granules $g_1$ and $g_2$ is
less than $tMax(g_1, g_2)$. The problem can be solved by
imposing that A, after entering $g_2$, randomly reports
$g_1$ or $g_2$ as the granule where she is located until time
$tMax(g_1, g_2)$ elapses from the last location update in
$g_1$. This solution is a form of temporal generalization,
as it adds uncertainty for the adversary about when the
user crosses the border between $g_1$ and $g_2$. More specifically,
the adversary is unable to identify the exact instant
in which the user crossed the border in a time
interval of length at least $tMax(g_1, g_2)$. Consequently,
by definition of $tMax(g_1, g_2)$, the adversary cannot exclude
that A moved from any point of $g_1$ to any point
of $g_2$.
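The adapted update policy can be sketched as follows for rectangular granules. The sketch assumes axis-aligned rectangles `(x_min, y_min, x_max, y_max)`, for which the largest point-to-point distance is attained between corners, and it assumes a uniform random choice between the two granules; the text only says that the report is random, so the uniform choice and the function names are illustrative assumptions.

```python
import math
import random
from itertools import product


def corners(rect):
    """The four corners of an axis-aligned rectangle (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = rect
    return [(x0, y0), (x0, y1), (x1, y0), (x1, y1)]


def t_max(g1, g2, v):
    """Time needed, at velocity v, to cover the largest distance between
    a point of g1 and a point of g2 (attained at a pair of corners)."""
    farthest = max(math.dist(p, q) for p, q in product(corners(g1), corners(g2)))
    return farthest / v


def reported_granule(g1_index, g2_index, elapsed_since_last_update_in_g1, t_max_value, rng=random):
    """Granule index reported by a user who has moved from g1 into g2:
    a random choice between the two until t_max has elapsed, then g2."""
    if elapsed_since_last_update_in_g1 < t_max_value:
        return rng.choice([g1_index, g2_index])
    return g2_index
```

For two adjacent 10 x 10 granules and v = 2, the farthest pair of points is at distance sqrt(500), so the user keeps reporting a random choice between the two granules for about 11.2 time units after her last update in the first granule.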

The extension of our defense techniques to other 
forms of background knowledge is one of the subjects 
for future work. 



5 Analysis of the protocols 

The main goal of our techniques is to guarantee the satisfaction
of users' privacy requirements under the given
adversary models. In Section 5.1 we prove that our two
protocols have this property.

However, there are other important parameters to
be considered in an evaluation and comparison among
protocols that satisfy the privacy requirements. In general,
the higher the privacy provided by the protocol,
the better it is for the users; since location privacy in our
model is captured by the size of the uncertainty region,
in Section 5.3 we consider this parameter.

A second parameter to be considered is service precision.
The percentage of false positives and false negatives
introduced by a specific protocol must be evaluated.
This is considered in Section 5.2.

Last but not least, it is important to evaluate the
overall system cost, including computation and communication,
with particular attention to client-side costs.
This is considered in Section 5.4.

The proofs of the formal results presented in this
section are in the appendix.



5.1 Privacy 

We first analyze the privacy provided by C-Hide&Seek
and C-Hide&Hash in Section 5.1.1 considering the adversary
models presented in Section 3 under the no-collusion
assumption, i.e., assuming that the SP does
not collude with the buddies and that the buddies do
not collude among themselves. Then, in Section 5.1.2
we show the privacy guarantees provided by the two algorithms
in the more general case of possibly colluding
adversaries.

5.1.1 Satisfaction of privacy requirements

We first analyze the C-Hide&Seek protocol. Since the
private key $K_A$ is only known to A and to the buddies of
A, the SP is not able to decrypt the index of the granule
where A is located. Analogously, the SP is not able to
obtain location information about A's buddies and, in
particular, does not obtain any information about the
distance between A and her buddies.

We now state a formal property of the C-Hide&Seek
protocol that is used in the formal proof of the above
observations.

Lemma 1 The C-Hide&Seek protocol ensures that, under
any a-priori knowledge $\mathit{pri}_A$, the following two random
variables are probabilistically independent: (1) the
binary random variable $ur(A)$, indicating that an update/request
is sent by user A, and (2) the random variable $\mathit{loc}_A$,
i.e., the location of A, of any distribution. Formally, we have

$$P(ur(A) \mid \mathit{loc}_A, \mathit{pri}_A) = P(ur(A) \mid \mathit{pri}_A),$$

for any a-priori location knowledge $\mathit{pri}_A$ and location
random variable $\mathit{loc}_A$ for user A.

Note that we are assuming discrete time and dis- 
crete location. A continuous case can be formalized and 
proved equally easily. Also, this lemma does not concern 
the type or content of a message sent by A, but just the 
fact that a message is sent by A. 

Another property we use to prove our safety result
is provided by the encryption algorithms, via the information
theoretical notion of "perfect secrecy" [4]. Intuitively,
perfect secrecy for an encryption algorithm
means that given ciphertext c, each plaintext p has the
same probability to be encrypted to c (posterior), with
a randomly chosen key, as the probability of p to be
used in the first place (prior). That is, $P(p \mid c) = P(p)$.
Equivalently, given plaintext p, each ciphertext c has
the same probability to be the encryption of p (posterior),
with a randomly chosen key, as the probability of
c to appear in the first place as ciphertext (prior). That
is, $P(c \mid p) = P(c)$. Applied to our situation, when the SP
receives a message $\langle A, ui, E_{K^{ui}}(l) \rangle$, since $K^{ui}$ is hidden
from the SP and can be chosen arbitrarily, the probability
that the SP receives any other message of the form
$\langle A, ui, E_{K^{ui}}(l') \rangle$ is the same.

Most practical encryption algorithms do not have
theoretical perfect secrecy, but use computational
hardness to achieve secrecy in the sense that it is computationally
very hard (or impractical) to derive the
plaintext from the ciphertext. Intuitively, $P(p \mid c) = P(p)$
holds because c does not yield any information about p.
Therefore, we use the simplifying, practical assumption
that the encryption methods we use do give us perfect
secrecy.

The above perfect secrecy discussion applies to single
messages. When dealing with multiple messages,
correlation between plaintexts may reveal secrets when
the same key is used. This is the classical repeated key
use problem, and one solution to this problem
is to use a so-called one-time pad or keystreams, as we
do in our proposed protocols. As each key is only used
once, encrypted messages are independent of each other
when perfect secrecy is assumed.
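The two points above, perfect secrecy with a fresh random key and the leakage caused by key reuse, can be checked exhaustively on a tiny example. The sketch assumes a 2-bit message space and XOR encryption with a uniformly random key, which is a textbook one-time pad rather than the cipher used in the protocols; the function names are hypothetical.

```python
from fractions import Fraction

BITS = 2
SPACE = range(2**BITS)  # plaintexts, keys and ciphertexts are 2-bit values


def ciphertext_distribution(p):
    """P(c | p) when the key k is uniform over SPACE and c = p XOR k."""
    counts = {c: 0 for c in SPACE}
    for k in SPACE:
        counts[p ^ k] += 1
    return {c: Fraction(n, len(SPACE)) for c, n in counts.items()}


def reuse_leak(p1, p2, k):
    """With a reused key, the XOR of the two ciphertexts equals
    the XOR of the two plaintexts, whatever the key is."""
    return (p1 ^ k) ^ (p2 ^ k)
```

Every plaintext yields the same uniform ciphertext distribution, so P(c | p) = P(c); with a reused key, however, an observer learns the XOR of the two plaintexts, which is why each keystream value is used only once.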

From the above discussion and assumptions, Lemma 2
follows. Since the lemma involves random variables on
messages, we need to specify the message space for these
variables. We consider the randomness of the messages
to be on the encrypted part, while other parts are fixed.
Formally, we call each sequence $\langle B_1, ui_1 \rangle, \ldots, \langle B_n, ui_n \rangle$,
where $B_j$ is a user and $ui_j$ is a time interval, a (message
set) type. (Recall that a message is of the form
$\langle B, ui, ES \rangle$.) The messages of the same type differ on
the encrypted part of the messages and constitute a
message space. When a generic message M is mentioned,
we assume it is a variable over all the messages
with a specific type.

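Lemma 2 Given messages $M = M_1 \cup M_2$ issued in the
C-Hide&Seek protocol, where $M_1 \cap M_2 = \emptyset$, we have

$$P(M \mid \mathit{loc}_A, \mathit{pri}_A) = P(M_1 \mid \mathit{loc}_A, \mathit{pri}_A) \cdot P(M_2 \mid \mathit{loc}_A, \mathit{pri}_A),$$

for all a-priori knowledge $\mathit{pri}_A$ and location $\mathit{loc}_A$ for
user A.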

With Lemma 1, perfect secrecy, and Lemma 2 we
now show a main result, namely, the SP does not acquire
any location information as a consequence of a
location update or a proximity request using the C-Hide&Seek
protocol. The following formal results implicitly
refer to our adversary models that, in particular,
assume that the SP has no background knowledge
other than the protocol, the a-priori distribution, and
the granularities.

Theorem 1 Let A be a user issuing a sequence of location
updates and proximity requests following the C-Hide&Seek
protocol. Then, A's privacy requirement is
satisfied with respect to the SP.

We now turn to the location information acquired
by the buddies. In the C-Hide&Seek protocol, a user A
issuing a proximity request does not send any location
information, hence her buddies, even if malicious, cannot
violate her privacy requirements. When the same
user runs the location update sub-protocol in C-Hide&Seek,
her buddies can only obtain the granule at the
granularity $G_A$ in which A is located. As a consequence,
the privacy requirement of A is guaranteed. This is formally
stated in Theorem 2.

Theorem 2 Let A be a user issuing a sequence of
location updates and proximity requests following the
C-Hide&Seek protocol. Then, A's privacy requirement is
satisfied with respect to each of A's buddies.

We now consider the C-Hide&Hash protocol. Since
$K_A$ is only known to A and her buddies, the SP is not
able to acquire the location information provided by
A during a location update. This follows from
Theorem 1. The difference between C-Hide&Hash and
C-Hide&Seek is that when A issues a proximity request
in C-Hide&Hash, an encrypted message is sent to the
SP. However, due to the property of the secure
computation protocol in C-Hide&Hash, the only information
that the SP acquires about the set provided by A is its
cardinality. Moreover, the cardinality of this set is always
$sMax$ that, by definition, depends only on $\delta_A$ and $G_B$,
and not on the actual location of A or B. Consequently,
the SP does not acquire any information about the
location of A and B, including their distance. Theorem 3
formally states this property.

Theorem 3 Let A be a user issuing a sequence of lo- 
cation updates and proximity requests following the C- 
Hide&Hash protocol. Then A's privacy requirement is 
satisfied with respect to the SP. 

Similarly to the C-Hide&Seek protocol, in C-Hide&Hash
each buddy of A can only obtain location
information derived from A's location update. It is worth
noting that in the C-Hide&Seek protocol, each time B
issues a proximity request, he obtains the granule of $G_A$
where his buddy A is located. Differently, using the
C-Hide&Hash protocol, B only gets to know whether the
granule where A is located is one of those in $S_A$. This
means that, if A is not in proximity, then B only learns
that A is not in any of the granules of $S_A$. Otherwise, if
A is in proximity, B learns that A is in one of the
granules of $S_A$, without knowing exactly in which granule
she is located. This is formally stated in Theorem 4.

Theorem 4 Let A be a user issuing a sequence of
location updates and proximity requests following the
C-Hide&Hash protocol. Then, A's privacy requirement is
satisfied with respect to each of A's buddies.

In Section 7 we show that, on average, C-Hide&Hash
provides more privacy with respect to the buddies than 
C-Hide&Seek, but at extra costs, making each protocol 
more adequate than the other based on user preferences 
and deployment modalities. 



5.1.2 Privacy in case of possibly colluding adversaries

We now consider the case in which our reference adver- 
saries can collude, and we analyze the privacy guaran- 
tees of the C-Hide&Hash and C-Hide&Seek protocols 
in this scenario. 

First, consider the case in which two buddies B and
C collude to violate the privacy of a user A. The
problem can be easily extended to consider more buddies.
Let $l_B$ be the set of possible locations of A obtained
by B as a result of a proximity request. Let $l_C$ be the
analogous information acquired by C during the same
update interval. Since B and C collude, they can derive
that A is located in $l_B \cap l_C$. However, due to Theorem 4,
given the granule $G_A(i)$ where A is located, it holds
that $l_B \supseteq G_A(i)$ and $l_C \supseteq G_A(i)$ (recall that $G_A$ is the
privacy requirement of A with respect to the buddies).
Consequently, $l_B \cap l_C \supseteq G_A(i)$ and hence the privacy
requirement of A is guaranteed also in the case B and
C collude.
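
The set argument above can be illustrated with a minimal sketch. It assumes
regions are represented as sets of cells of a fine grid; the helper names
(cells, colluding_knowledge) and the example regions are hypothetical and
only serve to show that intersecting regions that each contain A's granule
can never shrink the result below that granule.

```python
def cells(col_range, row_range):
    """Build a region as the set of (col, row) cells of a fine grid."""
    return {(c, r) for c in col_range for r in row_range}


def colluding_knowledge(*regions):
    """Region that colluding buddies can jointly derive: the intersection."""
    result = set(regions[0])
    for region in regions[1:]:
        result &= region
    return result


# Hypothetical example: A's granule is a 2x2 block of fine cells.
granule_a = cells(range(2, 4), range(2, 4))
# Each buddy learns a region that is a superset of the granule.
l_b = cells(range(0, 4), range(0, 4))
l_c = cells(range(2, 6), range(2, 6))

joint = colluding_knowledge(l_b, l_c)
# The joint knowledge still contains the whole granule of A.
print(granule_a <= joint)
```

In this example the intersection happens to coincide with the granule, which
is the worst case allowed by the privacy requirement of A.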

Now, consider the case in which the SP colludes
with one or more buddies. For example, if one of the
buddies shares the secret key $K_A$ with the SP, the SP
can learn the granule where A is located. In this case,
the privacy requirement of A with respect to the SP is
not guaranteed. Nevertheless, even if the SP knows $K_A$,
he cannot discover the location of A within the granule
$G_A(i)$ where A is located. This is because, by the
definition of the two protocols, every message issued by
A does not depend on the location of A within $G_A(i)$.
Consequently, the privacy requirement with respect to
the buddies is still guaranteed. This means that the
lowest privacy requirement of the two colluding entities
is preserved and this is the best that can be achieved
in case of collusion.



5.2 Service precision 

The techniques proposed in the literature, as well as the
techniques we propose in this paper, generalize the
location of one of the two users to an area. When proximity
is computed, the exact location of that user within the
area is not known. Hence, proximity is evaluated as the
distance between a point and a region.^

^ In previous work, the location of both users is generalized
and proximity is computed between two regions.

Consider how it is possible to compute the proximity
between a user A whose exact location is known
and a user B whose location is only known to be in
a region. It is easily seen that if the maximum distance
between the point and the region is less than the
proximity threshold, then the two users are in proximity,
independently of where B is located within the
region. Figure 6(a) shows an example of this situation. On
the contrary, if the minimum distance is larger than the
distance threshold, then the two users are not in
proximity. Figure 6(b) graphically shows that this happens
when no point of the region containing B is in proximity
of A. If neither of the two cases above holds (i.e., the
threshold distance is larger than the minimum distance
and less than the maximum distance), we are in the
presence of an uncertainty case, in which it is not possible to
compute whether the two users are in proximity without
introducing some approximation in the result. For
example, Figure 6(c) shows that if B is located close to
the bottom left corner of the region then B is in the
proximity of A, otherwise he is not.

Fig. 6 Regions $L_A$ and $L_B$

The choice we made in the presentation of our protocols
is to consider two users as in proximity in the
uncertainty case. The rationale is that in this case it is
not possible to exclude that the users are in proximity.
Previous approaches ([5UIIT5]) facing a similar issue
have adopted the same semantics.

One drawback of this minimum-distance semantics
is that it generates false positive results and this may
be undesirable in some applications. Indeed, if user B
is reported to be in proximity of A, then A may decide
to contact B (e.g., through IM). This may be annoying
for B, if he is not actually in proximity. Consider, for
example, the case in which the location of B is reported
at the granularity of a city: B is always reported as in
proximity of A when A is in the same city,
independently of the proximity threshold chosen by A.

An alternative semantics, that we name maximum-distance
semantics, solves this problem. The idea is to
consider two users as in proximity only when it is
certain that they are actually in proximity. This happens
when the maximum distance between their areas is less
than the distance threshold. While this approach does
not generate any false positives, it does produce false
negatives. The two semantics above have a common
drawback: in certain cases the probability of providing a
false result is larger than the probability of providing a
correct result. Consider the example depicted in Figure 7,
in which the minimum-distance semantics is considered.
User B is considered in proximity but the answer is wrong
if B is located in the region colored in gray. Assuming a
uniform distribution of B inside $g_B$, it is much more
likely to have an incorrect result than a correct one. An
analogous problem can arise for the maximum-distance
approach.

Fig. 7 Approximation incurred with the minimum-distance semantics

The percentage of false results can be minimized
by considering user B as in proximity only when at
least one half of the area is actually in proximity. The
drawback of this mostly-in-proximity semantics is that
it incurs both false positive and false negative results.

Our protocols are designed so that it is very easy
to change the current proximity semantics. Since this
can be done client-side, without the need for changes
server-side or in the code other peers are running, the
semantics can potentially be chosen through the user
interface at any time.
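
The three outcomes discussed in this section (certainly in proximity,
certainly not in proximity, uncertain) and the way the minimum-distance and
maximum-distance semantics resolve the uncertain case can be sketched as
follows. This is only an illustration under stated assumptions: the region of
B is taken to be an axis-aligned rectangular cell, distances are Euclidean,
and the function names are hypothetical. The mostly-in-proximity semantics is
omitted because it requires computing the area of a circle-rectangle
intersection.

```python
import math


def min_max_distance(point, cell):
    """Return (minimum, maximum) distance between a point and a cell.

    point is (x, y); cell is (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1.
    """
    x, y = point
    x0, y0, x1, y1 = cell
    # Closest point of the cell: offset is 0 on an axis the point overlaps.
    dx_min = max(x0 - x, 0.0, x - x1)
    dy_min = max(y0 - y, 0.0, y - y1)
    # The farthest point of a rectangle is always one of its corners.
    dx_max = max(abs(x - x0), abs(x - x1))
    dy_max = max(abs(y - y0), abs(y - y1))
    return math.hypot(dx_min, dy_min), math.hypot(dx_max, dy_max)


def classify(point, cell, delta):
    """Classify the cell with respect to the distance threshold delta."""
    d_min, d_max = min_max_distance(point, cell)
    if d_max < delta:
        return "in proximity"
    if d_min > delta:
        return "not in proximity"
    return "uncertain"


def report_in_proximity(point, cell, delta, semantics="minimum-distance"):
    """Answer returned to the user under the chosen semantics."""
    if semantics not in ("minimum-distance", "maximum-distance"):
        raise ValueError("unknown semantics: " + semantics)
    outcome = classify(point, cell, delta)
    if outcome == "uncertain":
        # Minimum-distance reports uncertain cases as in proximity,
        # maximum-distance reports them as not in proximity.
        return semantics == "minimum-distance"
    return outcome == "in proximity"
```

Since the choice only affects how the uncertain case is resolved, switching
semantics amounts to changing one argument on the client.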



We analytically measured the impact of the different
semantics on the accuracy of our protocols by calculating
the expected precision and the expected recall. The
expected precision is defined as the probability that a
buddy reported to be in proximity according to a given
semantics is actually in proximity. Vice versa, the
expected recall is defined as the probability that a buddy
actually in proximity is reported to be in proximity
according to a given semantics.

Figures 8 and 9 show the minimum expected precision
and recall for the minimum-distance and the
maximum-distance semantics. Both measures depend on the
ratio between $\delta$ and the area of the granules in which
a user is considered in proximity. For this analysis we
considered a grid-like granularity containing cells having
edge of size $l$ and we assume users are uniformly
distributed. As can be observed in Figure 8, the
maximum-distance semantics always has precision equal to
1. This is because all the buddies considered in proximity
are always actually in proximity. The minimum-distance
semantics has precision of about 1/3 when the values of
$\delta$ and $l$ are equal, and this value grows logarithmically
when $\delta$ is larger than $l$. The analysis of expected recall
(Figure 9) shows that the minimum-distance semantics
always has recall equal to 1. This is because if a buddy is
actually in proximity, it is always reported in proximity
using this semantics. The maximum-distance semantics,
on the contrary, has a minimum expected recall equal
to 0 when $\delta$ and $l$ are equal. This is because, with these
parameters, it can happen that no cells of size $l$ are fully
contained in a circle having radius $\delta$. However, the
recall of the maximum-distance semantics grows more
rapidly than the precision of the minimum-distance
semantics.



5.3 Size of uncertainty regions 



As already discussed in Section 5.1, our protocols are
proven to always guarantee the privacy requirement
with respect to the buddies. However, the main
difference between our two protocols consists in the fact
that C-Hide&Hash can provide additional privacy with
respect to one buddy. For example, if a user A issues
a proximity request using C-Hide&Hash, and a buddy
B is reported as being not in proximity, A only learns
that B is not located in any of the granules considered
in proximity (i.e., the ones included in $S$). The resulting
uncertainty region of B, in this case, is equal to the
entire space domain minus the region identified by $S$.
When B is reported to be in proximity, A learns that
B is located in one of the granules of $S$, but not exactly
in which of those granules. Therefore, the uncertainty
region in this case is given by the region identified by
$S$. The size of this region depends on the value $\delta_A$, on
the area of the granules in $G_B$, and on the distance
semantics chosen by A. In order to show how the size of
the uncertainty region is affected by these parameters,
we simplify the analysis by considering grid-like
granularities, similarly to Section 5.2. Each granularity is a
grid identified by the size $l$ of the edge of its cells.



Figure 10 shows the additional privacy achieved by
C-Hide&Hash for different values of $\delta/l$. The additional
privacy is measured as the lower bound of the number
of granules in $S$. As can be observed, using both
semantics, the additional privacy grows when $\delta$ is larger than
$l$. This means, for example, that if $\delta$ is 5 times larger
than $l$, then the actual size of the uncertainty region
of B is 60 (or 88) times larger than the minimum
privacy requirement if A is using the maximum-distance
(or minimum-distance, resp.) semantics.
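
As an illustration of how such counts can arise, the sketch below counts the
grid cells considered in proximity of a point. It rests on assumptions that
are not stated in the text: the point is placed exactly on a grid corner,
$\delta$ is an integer multiple $k$ of $l$, a cell counts for the
maximum-distance semantics when its farthest corner is within $\delta$, and
for the minimum-distance semantics when its closest point is strictly closer
than $\delta$. The function name is hypothetical. Under these assumptions,
$k = 5$ gives the values 60 and 88 quoted above.

```python
def granules_in_proximity(k, semantics):
    """Count unit cells considered in proximity of a point on a grid corner.

    k is the integer ratio delta / l; the point sits at the origin and the
    cells are the unit squares [i, i+1] x [j, j+1].
    """
    if semantics not in ("maximum-distance", "minimum-distance"):
        raise ValueError("unknown semantics: " + semantics)
    count = 0
    for i in range(-k - 1, k + 1):
        for j in range(-k - 1, k + 1):
            # Per-axis distance from the origin to the nearest/farthest edge.
            near_x, far_x = min(abs(i), abs(i + 1)), max(abs(i), abs(i + 1))
            near_y, far_y = min(abs(j), abs(j + 1)), max(abs(j), abs(j + 1))
            if semantics == "maximum-distance":
                # Whole cell inside the circle: farthest corner within delta.
                if far_x ** 2 + far_y ** 2 <= k * k:
                    count += 1
            else:
                # Cell overlaps the circle: closest point nearer than delta.
                if near_x ** 2 + near_y ** 2 < k * k:
                    count += 1
    return count


for ratio in (1, 2, 5):
    print(ratio,
          granules_in_proximity(ratio, "maximum-distance"),
          granules_in_proximity(ratio, "minimum-distance"))
```

Integer arithmetic on squared distances is used to avoid floating-point
issues for cells that touch the circle exactly.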



5.4 System costs 

We separately evaluate the computation and communication
costs involved in running the two proposed
protocols. The analytical evaluation reported here is
complemented with experimental results in Section 7.

Fig. 10 Privacy with respect to a buddy

5.4.1 C-Hide&Seek

In order to perform a location update, a user needs to
compute the index of the granule where she is located.
The time complexity of this operation depends on the
data structure used to represent granularities. As we
shall show in Section 7, with our implementation of the
granularities this operation can be performed in
constant time. The complexity of the encryption operation
depends on the encryption function and on the length
of the encryption key. Considering a fixed key length,
the encryption of the index of the granule can be
performed in constant time. Since the SP only needs to
store the received information, the expected
computational complexity is constant. The communication cost
is constant and consists of an encrypted integer value.

As for the cost of a proximity request
on the client side, for each buddy the issuing user needs
to decrypt the index and to compute the distance of
the granule with that index from her location. In our
implementation these operations can be performed in
constant time and hence the time complexity of the
proximity request operation on the client side is linear
in the number of buddies. On the SP side, the
computational cost to retrieve the last known locations of the
buddies is linear in the number of buddies. The
communication consists of one request message of constant
size from the user to the SP, and of one message from
the SP to the user with size linear in the number of
buddies.
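
The constant-time location-to-granule conversion mentioned above can be
sketched for a grid-based granularity. The sketch assumes a row-major
numbering of the cells and non-negative coordinates inside the spatial
domain; the function names and the numbering scheme are hypothetical and are
not taken from the actual implementation.

```python
import math


def granule_index(x, y, edge, columns):
    """Index of the grid cell containing (x, y), using row-major numbering.

    edge is the cell edge length and columns the number of cells per row.
    Runs in constant time: two divisions and one multiplication.
    """
    col = math.floor(x / edge)
    row = math.floor(y / edge)
    return row * columns + col


def granule_bounds(index, edge, columns):
    """Inverse mapping: (x0, y0, x1, y1) of the cell with the given index."""
    row, col = divmod(index, columns)
    return (col * edge, row * edge, (col + 1) * edge, (row + 1) * edge)


# Hypothetical usage: 100 m cells, 10 cells per row.
idx = granule_index(250.0, 130.0, 100.0, 10)
print(idx, granule_bounds(idx, 100.0, 10))
```

A buddy who decrypts the index can recover the cell with the inverse mapping
and then compute the distance between her own location and that cell.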



5.4.2 C-Hide&Hash 

The cost of a location update operation on the client
is similar to the cost of the same operation using
C-Hide&Seek, since the only difference is that a hashing
function, which can be computed in constant time, is
applied instead of the encryption function. As in
C-Hide&Seek, the SP only needs to store the received
information. Hence, computational costs of a location
update are constant both for the client and for the SP. The
communication cost is constant, as the only exchanged
message consists of a hashed value.

On the client side, a proximity request from A
requires, for each buddy B, the computation of the
granules of $G_B$ which are considered in proximity, the
hashing, and the encryption of a number of granule
indexes in the order of $sMax(G_B, \delta_A)$. The value of $sMax$
can be pre-computed for a given granularity. The
computation of the granules considered in proximity can
be performed in constant time in our implementation,
using grids as granularities. The computation of the
hashing and the encryption functions can also be
performed in constant time, hence the time complexity of
a proximity request is linear in the number of buddies
times the maximum among the $sMax$ values for the
involved granularities. When the client receives the
response from the SP, the result computation performed
by A for each buddy B requires the encryption of a
number (the encrypted value sent by the SP), and the
lookup of the encryption in a set of encrypted values
with cardinality $sMax(G_B, \delta_A)$. As the lookup in the
set of hashes requires at most $sMax$ operations, the
time complexity is then linear in the number of buddies
times the maximum value of $sMax$. Hence, this is
also the overall complexity on the client side. On the
SP side, the response to a proximity request from a user
A requires, for each buddy B, a) the retrieval and the
encryption of the hashed location of B, b) the
encryption of the $sMax(G_B, \delta_A)$ hashed granule indexes sent
by A. As the encryption runs in constant time, the time
complexity is linear in the number of buddies times the
maximum value of $sMax$.

Regarding the communication costs, both of the
messages involved in the proximity request sub-protocol
contain the encryption of a set of a number of hashed
values linear in the number of buddies times the
maximum value of $sMax$.
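
The request, response and lookup steps whose costs are analyzed above can be
sketched for a single buddy as follows. This is a toy illustration under
explicit assumptions, not the actual implementation: the commutative cipher
is modular exponentiation over a fixed prime, the keyed hash is SHA-256 over
the shared key concatenated with the granule index, and all function names
are hypothetical. Padding the set to exactly $sMax$ elements and shuffling
it are omitted.

```python
import hashlib
from math import gcd

# Toy modulus (a Mersenne prime); an assumption made for this sketch only.
P = 2 ** 127 - 1


def hash_granule(shared_key, index):
    """Keyed hash of a granule index, mapped into the range [1, P - 1]."""
    digest = hashlib.sha256(shared_key + index.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def encrypt(value, key):
    """Commutative encryption: exponentiation modulo P.

    The key must be coprime with P - 1 so that the mapping is a bijection.
    """
    if gcd(key, P - 1) != 1:
        raise ValueError("key must be coprime with P - 1")
    return pow(value, key, P)


def client_request(shared_key, candidate_granules, key_a):
    """A hashes and encrypts the granules of B considered in proximity."""
    return [encrypt(hash_granule(shared_key, g), key_a)
            for g in candidate_granules]


def sp_response(encrypted_set, hashed_location_b, key_sp):
    """The SP encrypts B's hashed location and re-encrypts A's set."""
    return (encrypt(hashed_location_b, key_sp),
            {encrypt(v, key_sp) for v in encrypted_set})


def client_result(sp_value, doubly_encrypted_set, key_a):
    """A encrypts the value from the SP and looks it up in the set."""
    return encrypt(sp_value, key_a) in doubly_encrypted_set
```

Each step performs a number of constant-time encryptions proportional to the
size of the set, which matches the linear dependence on $sMax$ per buddy
discussed above. The lookup works because exponentiation commutes: encrypting
first with A's key and then with the SP's key yields the same value as the
opposite order.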

6 System implementation 

We implemented the techniques presented in Section 4
in a system that provides proximity notification
coupled with typical instant messaging (IM) functionalities.
This implementation is the evolution of the system
developed for the Hide&Crypt protocol and it has
similar architecture, server and client applications [5].
The system is built as an extension of XMPP
(Extensible Messaging and Presence Protocol), an open
standard protocol often used in commercial applications
as a message oriented middleware [22]. The system
architecture is shown in Figure 11. The choice of
extending XMPP is driven by the following considerations.
First, the XMPP protocol can be easily extended
to support custom services and messages, like the
proximity service, in our case. In particular, by extending
XMPP messages, we designed a proper XML protocol
for each of our techniques. In addition, the SP
providing the proximity services is implemented as an
XMPP component, i.e., a pluggable entity that extends
the default XMPP functionalities. A second advantage
is that the XMPP protocol already includes standard
sub-protocols for client-to-client communication and for
managing the list of buddies. We used these sub-protocols
as primitives in our implementation. Finally, since the
XMPP architecture is decentralized, clients running on
different servers can communicate with each other. In
our case, since a component acts as a special type of
client, this means that our proximity service is
accessible to a user registered to an existing XMPP service,
including popular IM services like Google Talk or
Jabber. This makes it possible to use, in the proximity
service, the same list of buddies used in those IM services.
Clearly, proximity can be computed only for those
buddies that are participating in the same proximity
service.

As for the client, we developed a multi-platform
web application and another application specifically
designed for mobile devices based on the Apple iOS
operating system, like iPhone or iPod touch. In
addition to the typical functionalities of an IM application,
the clients implement the proximity protocols described
in Section 3 and provide the typical functionalities of
a full-fledged proximity service, including the detection
of the client user's location, the notification of any
buddies in proximity, and the graphical visualization of the
location uncertainty region for each buddy.

One of the issues that emerged during the implementation
of the C-Hide&Hash and C-Hide&Seek protocols
concerns key management. Indeed, both protocols
require that each user A has a key $K_A$ that is shared
with all of her buddies and kept secret from everybody
else. A first problem is how A can share her key
with one buddy B in a secure manner. This operation
is required, for example, when the user accesses the
proximity service for the first time or a new buddy is
added to the buddy list. To address this problem, we
employ standard public key cryptography techniques to
encrypt, for each buddy of a user A, the key $K_A$. After
being encrypted, the key can be safely transmitted over
an insecure channel. The second problem is how to
revoke a secret key. For example, this is necessary when
a buddy is removed from the buddy list, or when the
key is compromised. In our implementation, in order to
revoke a key, it is sufficient to generate a new secret key
and to send it to the authorized buddies.

The cost of sending a key to all the buddies is clearly 
linear in the number of buddies. In Section 7 we show
that the costs to perform this operation on a mobile de- 
vice are sustainable. In addition, it should be observed 
that the distribution of the key to all the buddies is 
only needed when a user first subscribes to the proxim- 
ity service or when a buddy is removed from the buddy 
list. These are very sporadic events during a typical IM 
service provisioning. 



7 Experimental results 

We conducted experiments to measure the performance
of our protocols and to compare them with the Pierre,
FriendLocator, Hide&Seek and Hide&Crypt protocols
[5ni[2Hl[T5]. We present the experimental setting in
Section 7.1. Then, in Sections 7.2, 7.3 and 7.4 we
evaluate the protocols according to three evaluation
criteria: quality of service, privacy and system costs,
respectively.



7.1 The experimental setting 

The experimental evaluation of the protocols presented
in this paper was performed on a survey-driven
synthetic dataset of user movements, which was obtained
using the MilanoByNight simulation.^ We carefully tuned
the simulator in order to reflect a typical deployment
scenario of a proximity service for geo-social networks:
100,000 potential users moving between their homes
and one or more entertainment places in the city of
Milan during a weekend night. The simulation also models
the time spent at the entertainment places, i.e., when
no movement occurs, following probability distributions
extracted from user surveys. All the test results shown
in this section are obtained as average values computed
over 1,000 users, each of them using the service during
the 4 hours of the simulation. Locations are sampled
every 2 minutes. The total size of the map is 215 km^2
and the average density is 465 users/km^2. All the
components of the system are implemented in Java.
Server-side tests were performed on a 64-bit Windows Server
2003 machine with a 2.4 GHz Intel Core 2 Quad processor
and 4 GB of shared RAM. Client-side tests were run on
an iPhone 4 mobile device, running Apple iOS 4.1 as
operating system. We implemented the symmetric
encryption and the hashing functions using the RC4 and
MD5 algorithms, respectively, while the RSA public key
encryption algorithm was used for the key distribution.

^ http://everywarelab.dico.unimi.it/lbs-datasim

In the experiments we used grid-based granularities.
Each granularity is identified by the size of the edge of
one cell of the grid. The location-to-granule conversion
operations required by our protocol can be performed
in constant time. For the sake of simplicity, in our tests
we assume that all the users share the same parameters
and that each user stays on-line during the entire
simulation. Table 1 shows the parameters used in our
experiments. Note that the "number of buddies"
parameter refers to the number of on-line buddies that,
for the considered type of application, is usually
significantly smaller than the total number of buddies.

Table 1 Parameter values

Parameter              Values
$\delta$               200m, 400m, 800m, 1600m
Edge of a cell of G    100m, 200m, 400m, 800m
Number of buddies      10, 25, 50, 75, 100



7.2 Evaluation of the quality of service 

The first set of experiments evaluates the impact of the
techniques on the quality of service, by measuring the
exactness of the answers returned by each protocol.
Indeed, two forms of approximation are introduced by our
protocols. The granularity approximation is caused by
the fact that, when computing the proximity between
two users, the location of one of them is always
generalized to the corresponding granule of her privacy
requirement granularity. The other approximation, which
we call the time-dependent approximation, is due to the
fact that, when a user issues a proximity request with
C-Hide&Seek, proximity is computed with respect to
the last reported location of each buddy. The
approximation is introduced because the buddies have possibly
moved since their last location update. Similarly, during
the computation of a proximity request with
C-Hide&Hash, the location transmitted by each buddy during
the previous update interval is used.

As for the granularity approximation,
a similar problem occurs with the Pierre and FriendLocator
protocols too. Indeed, both protocols, in order
to detect proximity between buddies, partition the
domain space into a grid, with each cell having edge $l$
equal to the distance threshold $\delta$, which must be shared
by the users. Then, a buddy B is considered in proximity
of A if B is located in the same cell as
A or in one of the 8 adjacent cells. The approximation
introduced by these techniques depends entirely on
the chosen value of $\delta$. Differently, in our solutions, each
user can choose her privacy requirements independently
of the value of $\delta$. For example, consider Figure 12.
The black dot is the actual location of user A. The dark
gray circle with radius $\delta$ is the area where the buddies
of A are actually in proximity of A. The light gray area
is the region in which buddies are erroneously reported
to be in proximity.^ Considering Figure 12(a), as $l$ is
always equal to $\delta$ when using Pierre or FriendLocator,
the total area of the 9 cells considered in proximity is
$9\delta^2$, while the area of the circle is $\pi\delta^2$, which is almost 3
times smaller. This means that, assuming a uniform
distribution of the users, using Pierre or FriendLocator
the probability that a buddy reported as in proximity
is actually in proximity is about 1/3. On the contrary,
in the protocols presented in this paper the size of the
granules is independent of the chosen $\delta$. In our
example, this means that when the value $l$ is smaller than
$\delta$, the region in which users are erroneously reported in
proximity becomes smaller (Figure 12(b)).
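
The "about 1/3" figure for Pierre and FriendLocator can be checked with a
short calculation. Assuming, as in the text, uniformly distributed users and
$l = \delta$, the circle of radius $\delta$ around A always lies inside the
block of 9 cells, so the expected precision is the ratio of the two areas:

```latex
\[
  \frac{\pi \delta^{2}}{9 \delta^{2}} \;=\; \frac{\pi}{9} \;\approx\; 0.349
\]
```

This value does not depend on $\delta$, which is consistent with a constant
precision below 0.4 for these two protocols.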



^ Here and in the following, we assume users of our protocols
are choosing the minimum-distance semantics.

Figure 13(a) shows how the granularity approximation
impacts the service precision for different values
of the edge of granularity cells. The metric we use
for the measurement is the information retrieval notion
of precision: the ratio of the number of correct
"in proximity" answers to the total number of "in
proximity" answers. Intuitively, the precision measures
the probability that a buddy reported "in proximity"
is actually in proximity. Note that the analysis would
be incomplete without considering the notion of recall:
the ratio of the number of correct "in proximity"
answers to the sum of correct "in proximity"
and incorrect "not in proximity" answers. Intuitively,
the recall measures the probability that a buddy
actually in proximity is reported "in proximity". In this
case, since we are considering the minimum-distance
semantics (see Section 5.2), the granularity approximation
does not produce any incorrect "not in proximity"
answer, and hence the recall is equal to 1. When
conducting this experiment, in order to exclude from the
evaluation the effects of the time-dependent
approximation, for each buddy we used his current location as
the last reported location. Since Pierre and FriendLocator
do not consider G, their precision is constant
in the chart and, as expected, is below 0.4. On the
contrary, C-Hide&Seek and C-Hide&Hash have a
significantly better precision when the edge of the cells of
G is small. Intuitively, this is because the area where a
buddy is erroneously reported as in proximity is smaller
than $\delta$ (see Figure 12(b)). Figure 13(a) also shows the
precision when the edge of a cell of G is larger than
$\delta$. The values are not reported for Pierre and
FriendLocator since in this case they do not guarantee the
privacy requirements.
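
The metrics used in this section can be computed from the four answer counts
as in the minimal sketch below. The function name and the example counts are
hypothetical and are not measurements from the experiments; accuracy is
included as the fraction of correct answers over all answers.

```python
def quality_metrics(true_pos, false_pos, false_neg, true_neg):
    """Precision, recall and accuracy of "in proximity" answers.

    true_pos:  correct "in proximity" answers
    false_pos: incorrect "in proximity" answers
    false_neg: incorrect "not in proximity" answers
    true_neg:  correct "not in proximity" answers
    A metric is None when its denominator is zero.
    """
    reported = true_pos + false_pos
    actual = true_pos + false_neg
    total = reported + false_neg + true_neg
    return {
        "precision": true_pos / reported if reported else None,
        "recall": true_pos / actual if actual else None,
        "accuracy": (true_pos + true_neg) / total if total else None,
    }


# Hypothetical counts: no incorrect "not in proximity" answers, as with the
# minimum-distance semantics when only the granularity approximation applies.
print(quality_metrics(true_pos=30, false_pos=60, false_neg=0, true_neg=910))
```

With these made-up counts the recall is 1 while the precision is 1/3, and
the accuracy is much higher than the precision because most answers are
correct "not in proximity" ones.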



Figure 13(b) shows the impact of the time-dependent
approximation. The chart shows the results for our
protocols only, as the other protocols proposed in the
literature are not exposed to this kind of approximation.
In order to exclude from this evaluation the effects of
the granularity approximation, we performed these tests
with the exact locations of the users, instead of the
generalized ones. The chart shows, on the x axis, different
lengths of the update interval and, on the y axis, the
precision of the C-Hide&Seek and C-Hide&Hash
protocols. It can be observed that C-Hide&Seek has better
precision. This is due to the fact that C-Hide&Hash
always uses the location reported during the previous
update interval, while C-Hide&Seek uses the last
location, which can be the one reported during the current
update interval or during the previous one. Since the
time-dependent approximation also introduces incorrect
"not in proximity" answers, we also measured the
recall. The corresponding chart is omitted as it is almost
identical to the one in Figure 13(b). For example, using
C-Hide&Hash and an update interval of 4 minutes, the
value of the precision is 0.89 and the recall is 0.88.

The computation of the precision and recall under
the time-dependent approximation confirms the
intuition that using long update intervals negatively
impacts the quality of service. The choice of a value
for the update interval should consider, in addition to
this approximation, the cost of performing a location
update. In general, the optimal value can be identified
based on specific deployment scenarios. Considering our
movement data, we chose 4 minutes as a trade-off value
since it guarantees precision higher than 0.9 and
sustainable system costs as detailed in Section 7.3. Our
choice is consistent with similar proximity services like,
for example, Google Latitude, which currently requires
location updates every 5 minutes.

Figure 14 shows the analysis of the quality of service
considering both the granularity and the time-dependent
approximations. Figure 14(a) shows the precision of our
two protocols compared with the precision of Pierre and
FriendLocator. We represent the precision of C-Hide&Seek
and C-Hide&Hash with a single curve because the two
protocols behave similarly. For example, when the edge of
a cell of G is 200m, the precision of C-Hide&Seek and
C-Hide&Hash is 0.59 and 0.57, respectively, while it is 0.61
for both protocols when the time-dependent approximation
is not considered. This shows that this second type of
approximation does not have a significant impact.

Figure 14(b) shows the recall of our protocols. Note
that Pierre and FriendLocator do not lead to incorrect
"not in proximity" answers, and hence their recall is equal
to 1. On the contrary, our protocols can generate incorrect
"not in proximity" answers due to the time-dependent
approximation. This chart shows that the recall of
C-Hide&Seek and C-Hide&Hash is always above 0.95 and 0.9,
respectively. From Figure 14(b) we can also observe that
the recall increases for coarser granularities. This is due
to the fact that fewer incorrect "not in proximity" answers
are returned if a coarser granularity is used. While this
may appear unreasonable, the explanation is straightforward:
there is an incorrect "not in proximity" answer only when a
buddy is currently in proximity (considering Figure 12(b),
his location is in the dark gray area) while the location
used in the computation of the proximity is outside the
light gray area. If the granularity is coarse, then the light
gray area is large, and hence incorrect "not in proximity"
answers are less frequent.

Figure 14(c) shows the accuracy for each considered
protocol, i.e., the percentage of correct answers. Also
in this case, the accuracy of C-Hide&Seek and C-Hide&Hash
is represented with a single curve, as the two protocols
behave similarly. Comparing this figure with Figure 14(a),
we can observe that the accuracy achieved by all the
protocols is much higher than the precision. This is due to
the fact that this metric also considers the correct "not in
proximity" answers, which are usually the most frequent
answers, since the proximity query area determined by the
distance threshold is usually much smaller than the entire
space. Figure 14(c) shows that our protocols achieve better
accuracy than Pierre and FriendLocator when the edge of the
granularity cells is smaller than the distance threshold δ.
In particular, for our default values, the accuracy of both
C-Hide&Seek and C-Hide&Hash is higher than 0.99.
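The relation between the three quality metrics can be illustrated with a minimal sketch. The counts below are hypothetical (they are not taken from our experiments) and the function name is an assumption introduced only for illustration; the point is that, when "not in proximity" is by far the most frequent correct answer, accuracy can exceed 0.99 even if precision is close to 0.6.

```python
def quality_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion counts.

    tp: correct "in proximity" answers
    fp: incorrect "in proximity" answers
    fn: incorrect "not in proximity" answers
    tn: correct "not in proximity" answers
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy


# Hypothetical counts: proximity is rare, so true negatives dominate.
precision, recall, accuracy = quality_metrics(tp=60, fp=40, fn=3, tn=9897)
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.4f}")
```

With these assumed counts, most answers are correct "not in proximity" answers, which is why accuracy is dominated by the true negatives while precision only reflects the much rarer "in proximity" answers.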



7.3 Evaluation of the system costs 

The second set of experiments evaluates the computation
and communication costs of the different protocols. For the
analysis of the Pierre protocol, we used the NearbyFriend
application, developed by the same authors, which integrates
the Pierre protocol in a desktop IM application.

First, we consider the costs related to the location
update sub-protocol. This analysis does not apply to
existing solutions, as location updates are only required
by our centralized solutions. As analyzed in Section 5.4,
the time complexity of computing a location update is
constant in the number of buddies. In our implementation,
the computation of each location update requires, on the
client side, about half a millisecond for both the
C-Hide&Seek and the C-Hide&Hash protocols. Similarly, the
communication cost is independent of the number of buddies,
and the payload of each location update message consists of
a few bytes. Considering the overhead caused by the XML
encapsulation, the size of each location update is in the
order of a few hundred bytes.

The computation time needed to run a proximity
request on the clients is shown in Figures 15(a) and
15(b). Note that the values reported in these figures only
consider the computation time required by the issuing
user. Indeed, all the protocols require the SP (in the case
of centralized services) or the other buddies (in the case
of distributed services) to participate in the protocol,
and hence to perform some computation. For example, in the
case of Hide&Crypt and Pierre, the total computation time
of a user's buddies to answer a proximity request issued by
that user is about the same as the computation time required
to issue the request. As observed in Section 5, the
computation time of a proximity request is linear in the
number of buddies.

NearbyFriend: http://crysp.uwaterloo.ca/software/nearbyfriend/

Fig. 15 Evaluation of the system costs



Figure 15(a) shows that C-Hide&Hash requires significantly
more time than C-Hide&Seek, especially when the number of
buddies is large. However, the computation time of
C-Hide&Hash needed to issue a proximity request for 100
buddies is only about 40ms on the mobile device we used in
our experiments. The figure also shows that the computation
times of C-Hide&Hash and Hide&Crypt are similar, with
Hide&Crypt performing slightly better. This is due to the
fact that in Hide&Crypt each of the sMax indexes only needs
to be encrypted, while in C-Hide&Hash it also needs to be
hashed.

As for the other existing solutions, we did not implement
the Pierre protocol on our mobile device platform. However,
considering the experimental results presented by the
authors (see [30]), the computation time of a single
proximity request with a single buddy is more than 350ms.
Since, for C-Hide&Hash, the computation time on a mobile
device of a proximity request with a single buddy is less
than 0.4ms, according to the data we have, our solution is
more than 800 times more efficient than the Pierre solution.



Figure 15(b) shows the impact of the size of the cells
of G on the computation time of a proximity request.
As expected, this parameter does not affect the computation
time of C-Hide&Seek, which is actually negligible, while it
clearly has an impact on C-Hide&Hash. Intuitively, when the
cells of G are small, a large number of indexes needs to be
encrypted and hashed and, in our experimental settings, the
computation cost may grow up to almost 140 milliseconds with
cells having an edge of 100m, when considering 100 buddies.
Although the computation time grows quadratically with the
inverse of the edge of a cell, we believe that using cells
with an edge smaller than 100m would not justify the
employment of this privacy-preserving technique. Indeed,
a cell with an area smaller than 100m × 100m denotes
a very low privacy requirement, while C-Hide&Hash is
preferable over C-Hide&Seek only when strong privacy
is required.
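The quadratic growth can be illustrated with a minimal sketch that counts the cells of a square grid intersecting a circular proximity area. The function name, the 1000m radius, and the choice of centring the circle on a grid corner are assumptions made only for this illustration; they are not parameters of our experiments.

```python
import math


def cells_intersecting_circle(radius_m, edge_m):
    """Count square grid cells (edge edge_m) that intersect the open disc
    of radius radius_m centred at the origin, placed on a grid corner."""
    n = math.ceil(radius_m / edge_m)
    count = 0
    for i in range(-n, n):
        for j in range(-n, n):
            # Distance along each axis from the origin to the closest
            # point of the cell [i*e, (i+1)*e] x [j*e, (j+1)*e].
            dx = max(i * edge_m, 0, -(i + 1) * edge_m)
            dy = max(j * edge_m, 0, -(j + 1) * edge_m)
            if dx * dx + dy * dy < radius_m * radius_m:
                count += 1
    return count


for edge in (400, 200, 100):
    print(edge, cells_intersecting_circle(1000, edge))
```

Under these assumptions, halving the edge from 200m to 100m raises the count from 88 to 344 cells, i.e., by a factor close to 4, which is the behaviour expected when the number of indexes to process grows with the square of the inverse of the edge.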

Regarding the computation costs on the server side,
the complexity of a proximity request using C-Hide&Hash
on the server side is similar to the one on the client side.
However, in our experiments we observed that our high-end
desktop machine is about 7 times faster than the mobile
client at executing these operations. As a consequence, the
computation for a single user having 50 buddies requires
around 3ms. While we did not run scalability tests on our
server, this result suggests that, from the computational
point of view, even a single desktop machine can provide the
service to a large number of users.

Regarding the computation time of more than 350ms reported
for Pierre: it is unclear whether this result was obtained
on a mobile device.

Figures 15(c) and 15(d) show the system communication
cost of a proximity request issued by a user. In
Figure 15(c) we measure the number of messages exchanged
by the system for each proximity request. It is easily seen
that, using a centralized protocol (i.e., C-Hide&Seek and
C-Hide&Hash), only two messages need to be exchanged (one
for the request and one for the response), independently of
the number of buddies the issuer has. On the contrary, the
decentralized protocols require at least two messages for
each buddy. Moreover, in our implementation of the
Hide&Crypt protocol, each communication between two users
needs to transit through the SP. The same applies to the
Pierre protocol, using the NearbyFriend implementation.
Consequently, at each location update, for each buddy, four
messages transit in the system: two between the issuer
and the SP and two between the SP and the buddy.
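The message counts just described can be summarized in a short sketch. The function and parameter names are hypothetical and serve only to restate the counting argument; they do not correspond to any component of our implementation.

```python
def messages_per_request(num_buddies, centralized, relayed_by_sp=True):
    """Number of messages in the system for one proximity computation.

    centralized:   True for C-Hide&Seek and C-Hide&Hash (one request and
                   one response, whatever the number of buddies).
    relayed_by_sp: for decentralized protocols, True when every
                   user-to-user message transits through the SP, so that
                   each of the two messages per buddy counts twice.
    """
    if centralized:
        return 2
    per_buddy = 4 if relayed_by_sp else 2
    return per_buddy * num_buddies


for n in (10, 50, 100):
    print(n, messages_per_request(n, True), messages_per_request(n, False))
```

For example, with 50 buddies this gives 2 messages for a centralized protocol and 200 messages for a decentralized protocol whose traffic is relayed by the SP.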



Figure 15(d) shows a comparison of the total amount
of data exchanged in the system for each proximity request.
Consistently with our analysis, the communication cost grows
linearly with the number of buddies for both of our
centralized protocols. It is easily seen that this also
applies to the other protocols. The chart shows that
NearbyFriend incurs high communication costs. The reason is
that, each time a proximity request is issued, a message of
almost 3KB is sent from the user to each of her buddies, and
a message of similar size is sent back in the reply. We
believe that this overhead is mostly due to the fact that
NearbyFriend needs all the communications between two users
to be encapsulated in a secure channel. This is required
because the Pierre protocol itself does not guarantee that
a third party acquiring the messages cannot derive location
information about the users. Since each message between
two users transits through the server, the communication
cost is almost 12KB for each buddy. The other decentralized
solution we compare with, Hide&Crypt, has better
communication costs. Indeed, each message is less than 1KB,
and hence the cost is about 1/4 of that of Pierre.

Our centralized solutions are even more efficient.
This is due to the fact that only two messages need
to be exchanged between the user and the SP for each
proximity request. In the case of C-Hide&Hash, each message
has the same size as in Hide&Crypt, and hence, in this case,
the communication cost is one half with respect to
Hide&Crypt, and about one order of magnitude less with
respect to Pierre. Finally, C-Hide&Seek, in addition to
being a centralized solution, also benefits from the fact
that each message contains only a few hundred bytes.
Consequently, this protocol is about 4 times more efficient
than C-Hide&Hash.



In Figure 15(e) we evaluate the communication cost
of the continuous use of a proximity service with our
protocols. As mentioned in Section 7.2, we consider that
location updates are issued every 4 minutes. Considering
the results of our user survey, we assume that a proximity
request is issued every 10 minutes on average. The main
difference of this figure with respect to Figure 15(d) is
that it also considers the communication costs deriving
from the location updates. However, since each location
update costs less than 300 bytes, and 15 location updates
need to be issued in one hour, the total hourly cost for
this sub-protocol is about 4KB, which is negligible with
respect to the communication cost of the proximity requests.
The figure also shows that the centralized protocols require
significantly less communication than the decentralized
ones. In particular, C-Hide&Seek requires around 100KB per
hour when the user has 50 online buddies. C-Hide&Hash, on
the other hand, requires less than 500KB per hour for the
same number of buddies. We believe that this cost is largely
sustainable on a wireless broadband network (e.g., 3G),
and that, given the additional privacy with respect to
curious buddies achieved using C-Hide&Hash, privacy-concerned
users may find this trade-off attractive.
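The hourly cost of the location update sub-protocol can be checked with a few lines of arithmetic. The constant names are introduced here for illustration only; the 300-byte value is the upper bound on the update size stated above, so the result is an upper estimate.

```python
UPDATE_INTERVAL_MIN = 4      # one location update every 4 minutes
REQUEST_INTERVAL_MIN = 10    # one proximity request every 10 minutes
UPDATE_SIZE_BYTES = 300      # upper bound on the size of one update

updates_per_hour = 60 // UPDATE_INTERVAL_MIN
requests_per_hour = 60 // REQUEST_INTERVAL_MIN
hourly_update_bytes = updates_per_hour * UPDATE_SIZE_BYTES

print(updates_per_hour, "updates/hour,", requests_per_hour, "requests/hour")
print(f"location updates: at most {hourly_update_bytes / 1024:.1f} KB/hour")
```

The 15 updates per hour amount to at most 4500 bytes, i.e., slightly more than 4KB, a small fraction of the roughly 100KB per hour reported for C-Hide&Seek with 50 online buddies.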

Our experimental evaluation also included the mea- 
surement of the cost to distribute the private key (see 
Section pi) . Both the computation and communication 
costs are linear in the number of buddies that need to 
receive the new key. For a single buddy, the computa- 
tion time is about 7ms, measured on the mobile device, 
while the communication cost is less than 200 bytes. An
experiment of key distribution to 50 buddies resulted
in a computation time of 340ms and a communication
cost of less than 9KB.



7.4 Evaluation of the achieved privacy 

In Section 5 we proved that both of our protocols
guarantee the users' privacy requirements. We also observed
that C-Hide&Hash provides more privacy than what would be
strictly necessary to guarantee the requirements. In this
last set of experiments we evaluate how much additional
privacy is provided by C-Hide&Hash in terms of the size of
the uncertainty region. We recall that this is the area
where a user A is possibly located, as it can be computed by
one of A's buddies after issuing a proximity request that
returns A as in proximity.

Figure 16 shows that the privacy provided by C-Hide&Hash
is always significantly larger than the privacy requirement,
and that it grows for coarser granularities G. Intuitively,
with C-Hide&Hash, the uncertainty region corresponds to the
union of the light and dark gray areas represented in
Figure 12(b). Consequently, as the size of the cells of G
decreases, the size of the light gray area tends to zero,
and the uncertainty region becomes closer and closer to the
dark gray area only. This means that the privacy provided by
C-Hide&Hash is at least πδ², even when the user requires her
location to be obfuscated in a smaller area.

8 Discussion and conclusions

We presented a comprehensive study of the location 
privacy problem associated with the use of a proxim- 
ity service as an important component of any geo-social 
network. We illustrated two new protocols to compute 
users' proximity that take advantage of the presence of 
a third party to reduce the computation and commu- 
nication costs with respect to decentralized solutions. 
We formally proved that the service provider acting as 
the third party, by running the protocol, cannot acquire 
any new location information about the users, not even
in the presence of a-priori knowledge of users' locations. We
also showed that each user can have full control of the 
location information acquired by her buddies. Exten- 
sive experimental work and a complete implementation 
illustrate the benefits of the proposed solutions with 
respect to existing ones, as well as their actual applica- 
bility. 

The two centralized solutions we propose require
each user to share keys with her buddies, and hence
are not the most appropriate to be used in a "query
driven" service (e.g., finding people meeting certain
criteria). The decentralized versions of the two presented
protocols are more suitable in this case [15].

An interesting direction we plan to investigate is 
to extend the adversary models we considered in this 
paper to include not only (atemporal) a-priori location 
knowledge, but also time-dependent location knowledge.
This would model not only a-priori knowledge about 
velocity, that our solutions can already deal with, but 
also a-priori probabilistic proximity information. It is 
still unclear if the proposed protocols, with appropriate 
location update strategies, similar to those discussed in 
Section [^4} need to be modified in order to be proven 
privacy-preserving according to our definitions. 

An interesting extension of our protocols is to allow
users to specify different privacy preferences with
respect to different groups of buddies. This is not
difficult, but it exposes the users to dangerous collusion
attacks if further constraints are not imposed. The
presented protocols are not subject to buddies' collusion
attacks, since each user defines the same granularity as
privacy preference with respect to all of her buddies. If
this is not the case, a user A, by assigning two different
granularities with respect to buddies B and C to reflect
her different levels of trust, would expect that, if B and C
collude, the lowest privacy requirement among the two
is preserved. However, an adversary could actually
intersect the uncertainty regions and potentially violate
both privacy requirements. In order for our protocols
to defend against such a collusion, some relationships
need to be imposed on the granularities used in the system.
While details are out of the scope of this paper,
intuitively, granules from different granularities should
never partially overlap. For example, using hierarchical
grids as granularities would be a sufficient condition.
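The collusion problem can be illustrated with a minimal one-dimensional sketch. Granules are modelled as intervals on a line, and the widths, offsets, and function names are hypothetical simplifications chosen for this example; the actual granularities are two-dimensional.

```python
def granule(x, width, offset=0):
    """Return the half-open interval [lo, hi) of the granule containing x,
    for a 1-D grid of the given width whose first granule starts at offset."""
    k = (x - offset) // width
    lo = offset + k * width
    return lo, lo + width


def intersect(a, b):
    """Intersection of two half-open intervals, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None


x = 130  # hypothetical position of user A

# Partially overlapping granularities (widths 100 and 150): the colluding
# buddies learn an interval narrower than either privacy requirement.
print(intersect(granule(x, 100), granule(x, 150)))

# Hierarchical granularities (widths 100 and 200): every coarse granule is
# a union of fine granules, so collusion reveals only the finer granule.
print(intersect(granule(x, 100), granule(x, 200)))
```

In the first case the intersection is the interval (100, 150), of width 50, which violates both requirements; in the second case it is (100, 200), which coincides with the finer granule, so the lowest of the two privacy requirements is preserved.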

Finally, our solution is limited to location privacy 
and it does not enforce anonymity since the participat- 
ing buddies often know each other. We do not exclude 
that for some proximity services anonymity would be
desirable in order, for example, to hide from the SP the
real identity of buddies or each user's list of buddies.
However, since the techniques proposed in this paper 
guarantee that no location information is disclosed to 
the SP, it would not be difficult to adapt them to pro- 
vide anonymity with respect to the SP by applying, for 
example, existing anonymization techniques for standard
(i.e., non spatio-temporal) datasets.

A Proofs of formal results

A.1 Proof of Lemma 1

Proof The sought-after independence intuitively means that whether
an update/request is sent to the SP by a user A is not related to
where the user is located. Formally, by the definition of
conditional probability, we have

$$
\begin{aligned}
P(ur(A) \mid loc_A, pri_A)
&= P(ur(A), loc_A \mid pri_A) \,/\, P(loc_A \mid pri_A) \\
&= \bigl(P(ur(A) \mid pri_A) \cdot P(loc_A \mid pri_A)\bigr) \,/\, P(loc_A \mid pri_A) \\
&= P(ur(A) \mid pri_A).
\end{aligned}
$$

The second equality is due to the protocol, in which an
update/request is sent at fixed time intervals for each user,
independently of the user's location. Hence, the lemma follows.



A.2 Proof of Lemma 2

Proof All we need is

$$
P(M_1 \mid M_2, loc_A, pri_A) = P(M_1 \mid loc_A, pri_A),
$$

i.e., the knowledge of the messages in $M_2$ does not have any
impact on the probability of the messages in $M_1$. But this follows
from the perfect secrecy assumption and the use of keystreams in our
protocol.



A.3 Proof of Theorem 1

Proof We prove the theorem by showing that for each set $M$
of messages exchanged during the protocol, we have
$P(post_A) = P(pri_A)$. That is, the messages $M$ do not change the
SP's knowledge of A's location. By assumption of the theorem,
$P(post_A) = P(loc_A \mid M, pri_A)$, as the only knowledge is $M$ and
$pri_A$. The knowledge that $loc_A \in g_A$ is useless, as we assume
in this case that $g_A$ is the whole spatial domain. By the
definition of conditional probability, we have

$$
P(loc_A \mid M, pri_A) = P(M \mid loc_A, pri_A) \cdot P(loc_A \mid pri_A) \,/\, P(M \mid pri_A).
$$

It now suffices to show

$$
P(M \mid loc_A, pri_A) = P(M \mid pri_A). \tag{2}
$$

Intuitively, Equation 2 says that the messages $M$ are independent
of the location of A. This follows from two observations: the first
is that the issuance of messages does not depend on the location
of A, by Lemma 1, and the second is that the (encrypted) messages
are independent of the content of the messages, by Lemma 2. More
formally, assume

$$
M = m_1, \ldots, m_n.
$$

Let $ur(M)$ be the messages of the form

$$
ur(m_1), \ldots, ur(m_n),
$$

where $ur(m_i)$ is "an update/request is sent by user $B_i$". That
is, $ur(m_i)$ disregards the encrypted part of the message and only
says that a message is sent and by whom. By the perfect secrecy
assumption, the probability of a particular (single) message is
the same as that of any other (single) message that differs only in
the encrypted part, and hence the same as the probability of
$ur(m_i)$.

Consider the case of two messages in $M$, i.e., $n = 2$. Now we
have:

$$
\begin{aligned}
P(M \mid loc_A, pri_A)
&= P(m_1, m_2 \mid loc_A, pri_A) \\
&= P(m_1 \mid m_2, loc_A, pri_A) \cdot P(m_2 \mid loc_A, pri_A) \\
&= P(m_1 \mid loc_A, pri_A) \cdot P(m_2 \mid loc_A, pri_A) && \text{by Lemma 2} \\
&= P(ur(m_1) \mid loc_A, pri_A) \cdot P(ur(m_2) \mid loc_A, pri_A) && \text{by the above discussion} \\
&= P(ur(m_1), ur(m_2) \mid loc_A, pri_A) && \text{by Lemma 2} \\
&= P(ur(M) \mid loc_A, pri_A).
\end{aligned}
$$

The above can be extended to $n$ messages in $M$, and also to show
the equation $P(M \mid pri_A) = P(ur(M) \mid pri_A)$. Hence,

$$
\begin{aligned}
P(M \mid loc_A, pri_A)
&= P(ur(M) \mid loc_A, pri_A) \\
&= P(ur(M) \mid pri_A) && \text{by Lemma 1} \\
&= P(M \mid pri_A)
\end{aligned}
$$

and the thesis is established.
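The role of the perfect secrecy assumption in the argument above can be illustrated with a toy example. The sketch below uses an XOR one-time pad over 3-bit values, which is an assumption chosen only because it makes exhaustive enumeration possible; it is not the cipher or the keystream construction used in the protocols.

```python
from collections import Counter

BITS = 3
VALUES = range(2 ** BITS)  # all possible plaintexts, keys, and ciphertexts


def ciphertext_distribution(plaintext):
    """Count how often each ciphertext occurs when the plaintext is XORed
    with every possible key, each key being equally likely."""
    return Counter(plaintext ^ key for key in VALUES)


distributions = [ciphertext_distribution(p) for p in VALUES]

# The distribution of the ciphertext is the same for every plaintext, so
# observing a ciphertext tells nothing about which plaintext was sent.
print(all(d == distributions[0] for d in distributions))
```

Since every plaintext induces the same ciphertext distribution, an observer without the key can only learn that a message was sent and by whom, which is the information captured by $ur(m_i)$ in the proof.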



A.4 Proof of Theorem 2

Proof Given a buddy B, we prove the theorem by showing that
for each set $M$ of messages exchanged during the protocol, we
have

$$
P(loc_A \mid M, pri_A, loc_A \in g_A) = P(loc_A \mid pri_A, loc_A \in g_A),
$$

where A is another user, and $g_A$ is the location information that
is encrypted in the messages of A with the key shared between A
and B. In other words, we want to show that B will not acquire,
through the messages, more location information about A than what
B already knows. Intuitively, this is true since the location
information revealed by A is only at the granule level, and does
not specify where within the granule.

The formal proof is the same as for Theorem 1, but with
the following two changes: (1) $ur(m)$ represents that the request
was sent from the granule included in the message if the message is
intended for B; otherwise, it is the same as before. (2)
$loc_A \in g_A$ is included in $pri_A$, or equivalently we replace
each occurrence of $pri_A$ with $(loc_A \in g_A, pri_A)$. Let us now
examine the steps in the proof of Theorem 1.

Lemma 1 still holds, since updates/requests are sent regardless
of locations if the user who sent the message is $C \neq A$. If
$C = A$, then $ur(A)$ gives the location (the granule) where the
message is sent. In this case, the location is totally dependent on
the given information of $loc_A$, $loc_A \in g_A$ and $pri_A$. Note
that $l$ is the index of a granule: any information contained in
$loc_A$ and $pri_A$ below the granule level is not relevant to the
probability of a message.

For Lemma 2, the content in $M_2$ still does not have any impact
on the content in $M_1$, even when B can decrypt the messages
intended for him, as there is no information (from $pri_A$, $loc_A$,
and $loc_A \in g_A$) that restricts any possible content in $M_1$, so
the conditional probability of $M_1$ does not change regardless of
the existence of $M_2$.

For the discussion regarding the probability of $m_i$ and
$ur(m_i)$, with the addition of $loc_A \in g_A$, we still have that
the conditional probability of $m_i$ is the same as that of
$ur(m_i)$. Indeed, assume

$$
m_i = \langle C, ui, E_{K^{ui}}(l) \rangle.
$$

If $C \neq A$, then all messages of this type have the same
probability with or without knowing A's location, since C's location
information is not assumed in the conditional probability. This case
is exactly the same as for the SP, and the conditional probability
of $m_i$ is the same as that of $ur(m_i)$. If $C = A$, since B can
decrypt the message, hence knowing the location $l$ in the message,
this location $l$ (an index value of a granule in $G_A$) needs to be
consistent with the location knowledge in $loc_A$ and $pri_A$: if it
is not consistent, then the probability of the message is zero;
otherwise, the probability is totally dependent on the probability
of A being in $G_A(l)$ given $loc_A$, $loc_A \in g_A$, and $pri_A$.
But the same can be said about $ur(m_i)$ (which says that a message
was sent at the given location), i.e., the probability of $ur(m_i)$
depends totally on $loc_A$, $loc_A \in g_A$, and $pri_A$. Therefore,
$m_i$ and $ur(m_i)$ have the same conditional probability. By the
same reasoning as in the proof of Theorem 1, $ur(M)$ has the same
conditional probability as $M$.

With all the above discussions, the theorem is established.



A.5 Proof of Theorem 3

Proof The proof follows the same style as that for Theorem 1.
That is, we show $P(M \mid loc_A, pri_A) = P(M \mid pri_A)$, i.e., the
location of A does not change the probability of the messages $M$
conditioned on $pri_A$. As for Theorem 2, we examine the proof steps
of Theorem 1 for the purpose of the current thesis. Lemmas 1
and 2 both hold due to the use of a hashing function that displays
stronger secrecy than encryption. The important difference is the
discussion of the conditional probabilities of $m$ and $ur(m)$. If
$m$ is an update, then the same applies as in the proof of
Theorem 1. The difference is when $m$ is a proximity request. In
this case, the message contains multiple components. The critical
step is to show that all such messages have the same conditional
probability (to the SP), and hence the same as the conditional
probability of $ur(m)$. This is not difficult, since the location
information in the condition is opaque to the SP. This opaqueness
is given by two facts. The first is that the number of components in
the message is the same regardless of the location information. The
second is that the indexes of the granules and the "padding" ($S''$
in the protocol) in the message components are hashed, and hence to
the SP all possible granule indexes are equally possible in the
encrypted (by $K_1$ in the protocol) message. (Here, hashing before
encryption with $K_1$ is important, as the adversary cannot attack
using a known pattern of the plaintext.) The above observations
lead to the thesis of this theorem.



A.6 Proof of Theorem 4

Proof Intuitively, to the buddies, C-Hide&Hash is much stronger
than C-Hide&Seek, since buddies only share a hashing function
and the buddies' location information is encrypted with a random
key (generated by the SP) before being sent to the requesting user
B. Formally, the proof follows the same style as that for
Theorem 2. The only difference is what it means for a message to be
"consistent" with the location knowledge. In this case, from B's
perspective, we need to define $ur(m)$ to be the binary random
variable that "the user is in one of the requesting granules or
not" for the message sent back from the SP (as the reply to a
proximity request from B). After requesting proximity, B will
receive a message from the SP with the encrypted hash value of
A's location (in addition to the "kick back" from the SP in the
form of encrypted values that B sent to the SP). Even though B
and A share the hash function, B does not know the encryption
key, which is randomly generated by the SP ($K_2$ in the protocol).
Therefore, this value is probabilistically independent of the
location of A. In this case, based on the protocol, the only
information B obtains is whether A is in a granule among the ones
given by B. This needs to be consistent with the location
information contained in $loc_A$ and $pri_A$. If not, then the
probability of this message is zero; otherwise, the probability is
totally dependent on $loc_A$ and $pri_A$, as no other information is
available. The thesis follows from the above discussions in the
same style as the proof of Theorem 2.



